Importing raw Bruker NMR

csm_import_raw_nmr is a new import system that imports, calibrates and interpolates spectra, imports sample metadata, and runs basic analytical QC checks on the data.

Usage is either with dialog boxes:

>> imported = csm_import_raw_nmr()

Or by specifying the major file paths in a struct:

>> example_dir = fullfile(csm_settings.getValue('toolbox_path'), 'csm_tools', 'misc', 'example_data', 'raw_bruker');
>> ex = struct;
>> ex.sample_metadata_path = fullfile(example_dir, 'sample_metadata.xlsx');
>> ex.nmr_experiment_info_path = fullfile(example_dir, 'NMR_experiment_info.xlsx');
>> ex.nmr_calibration_info_path = fullfile(example_dir, 'NMR_calibration_info.xlsx');
>> ex.experiment_path = example_dir;
>> imported = csm_import_raw_nmr('experiment_info', ex);

General info

When importing, the system imports the sample metadata, runs the preprocessing steps csm_calibrate_nmr and csm_interpolate_nmr, runs QC, and handles any errors.

How the system preprocesses the data is specified in the required files.

Mixed sample types are allowed, so if you have racks with mixed sample types, just specify them in the calibration file. When imported, they are placed into separate csm_spectra objects, one per sample type.
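
The per-sample-type split can be pictured with plain MATLAB. The sketch below is not the toolbox's implementation; it only groups sample IDs by the sample type given in the calibration file. The two cell arrays stand in for the Sample ID and Sample Type columns, and the Plasma entries are hypothetical (the example data contains only Serum).

```matlab
% Stand-in for the Sample ID and Sample Type columns of the calibration file.
% The Plasma entries are hypothetical; the example data is all Serum.
sample_ids   = {'Sample_1', 'Sample_2', 'Sample_3', 'Sample_4'};
sample_types = {'Serum', 'Plasma', 'Serum', 'Plasma'};

% Group the sample IDs by sample type, one map entry per type.
groups = containers.Map();
for i = 1:numel(sample_ids)
    t = sample_types{i};
    if isKey(groups, t)
        groups(t) = [groups(t), sample_ids(i)];
    else
        groups(t) = sample_ids(i);   % 1x1 cell holding the first ID
    end
end

type_names = keys(groups);           % the sample types found
serum_ids  = groups('Serum');        % sample IDs of one type
```

Each key of groups corresponds to one sample type, which mirrors how the imported spectra are keyed by sample type.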

Requirements

experiment_path

The experiment_path is the full path to the parent folder which contains the raw Bruker data.

sample_metadata_path

The sample_metadata_path is the full path to the file which contains the metadata for the samples.

nmr_experiment_info_path

The nmr_experiment_info_path is the full path to the file which contains the mappings to the raw Bruker data.

nmr_calibration_info_path

The nmr_calibration_info_path is the full path to the file which contains the data necessary for spectral calibration.

For examples of these files, please look in csm_tools/misc/example_data/raw_bruker

Either enter these into the experiment_info struct when calling the import system, or select them during the dialog prompts.
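
When the paths are passed in a struct, it can help to confirm that they exist before starting the import. The following is a minimal sketch using only built-in MATLAB functions; it is not part of the toolbox. The struct is filled with stand-in values (tempdir and three hypothetical file names) so that the sketch runs on its own.

```matlab
% Stand-in struct so the sketch runs on its own: tempdir exists, while the
% three hypothetical .xlsx names below are not expected to exist.
ex = struct();
ex.experiment_path           = tempdir;
ex.sample_metadata_path      = fullfile(tempdir, 'hypothetical_sample_metadata.xlsx');
ex.nmr_experiment_info_path  = fullfile(tempdir, 'hypothetical_NMR_experiment_info.xlsx');
ex.nmr_calibration_info_path = fullfile(tempdir, 'hypothetical_NMR_calibration_info.xlsx');

% Collect every field that is absent or does not exist on disk.
file_fields = {'sample_metadata_path', 'nmr_experiment_info_path', ...
               'nmr_calibration_info_path'};
missing = {};
if ~isfield(ex, 'experiment_path') || exist(ex.experiment_path, 'dir') ~= 7
    missing{end+1} = 'experiment_path';
end
for i = 1:numel(file_fields)
    name = file_fields{i};
    % exist(..., 'file') returns 2 when the full path points at a file
    if ~isfield(ex, name) || exist(ex.(name), 'file') ~= 2
        missing{end+1} = name; %#ok<AGROW>
    end
end

if isempty(missing)
    disp('All paths found.')
else
    fprintf('Missing or invalid: %s\n', strjoin(missing, ', '));
end
```

With real paths in ex, an empty missing list means the struct is ready to be passed as the 'experiment_info' argument.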

sample_metadata_path

The sample_metadata.xlsx file contains sample metadata about the experiment. These fields will be imported into the csm_nmr_spectra objects.

Sample ID    Further Sample info    Case-control pair    Sampling Date    Sampling Protocol    Ship order    Age    Gender
Sample_1     Case                   1518                 2010             3                    1269          63     M
Sample_2     Control                1518                 2010             3                    1270          63     M
Sample_3     Case                   2168                 2010             3                    1280          63     M
Sample_4     Control                2168                 2010             3                    1290          63     M

Once imported and assigned to the variable spectra (see Output), the metadata is saved inside the sample_metadata object.

>> spectra.sample_metadata

ans =

  csm_sample_metadata with properties:

                entries: [4x1 containers.Map]
    dynamic_field_names: {}
               filename: '/Users/ghaggart/workspace/matlab/IMPaCTS/csm_tools...'

% View the entry keys
>> keys(spectra.sample_metadata.entries)

ans =

    'Sample_2'    'Sample_1'    'Sample_4'    'Sample_3'

% Get a specific sample_metadata_entry
>> sample_metadata_entry = spectra.sample_metadata.entries('Sample_2')

sample_metadata_entry =

  csm_sample_metadata_entry with properties:

         sample_id: 'Sample_2'
    dynamic_fields: [7x1 containers.Map]

% Get the keys for the dynamic fields (column headers)
>> keys(sample_metadata_entry.dynamic_fields)

ans =

  Columns 1 through 5

    'Age'    'Case-control pair'    'Further Sample info'    'Gender'    'Sampling Date'

  Columns 6 through 7

    'Sampling Protocol'    'Ship order'

% Get the value for a specific column
>> sample_metadata_entry.dynamic_fields('Age')

ans =

    63
    

Sample metadata can also be accessed as a table by using:

>> sample_table = spectra.sample_metadata.getTable()

sample_table =

    Sample_ID     Case_control_pair    Further_Sample_info    Gender    Sampling_Date    Sampling_Protocol    Ship_order
    __________    _________________    ___________________    ______    _____________    _________________    __________

    'Sample_2'    1518                 'Control'              'M'       2010             3                    1270
    'Sample_1'    1518                 'Case'                 'M'       2010             3                    1269
    'Sample_4'    2168                 'Control'              'M'       2010             3                    1290
    'Sample_3'    2168                 'Case'                 'M'       2010             3                    1280
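
Assuming getTable() returns a standard MATLAB table (the display above suggests so, but this is not confirmed here), the usual table operations apply. The sketch below rebuilds three of the columns shown above as a stand-in, so it runs without the toolbox, and then selects the case samples.

```matlab
% Stand-in for spectra.sample_metadata.getTable(), limited to three columns.
Sample_ID           = {'Sample_2'; 'Sample_1'; 'Sample_4'; 'Sample_3'};
Case_control_pair   = [1518; 1518; 2168; 2168];
Further_Sample_info = {'Control'; 'Case'; 'Control'; 'Case'};
sample_table = table(Sample_ID, Case_control_pair, Further_Sample_info);

% Keep only the cases, then order them by case-control pair.
is_case  = strcmp(sample_table.Further_Sample_info, 'Case');
cases    = sortrows(sample_table(is_case, :), 'Case_control_pair');
case_ids = cases.Sample_ID;
```

With the real table, replace the stand-in with sample_table = spectra.sample_metadata.getTable().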

    

nmr_experiment_info_path

The NMR_experiment_info.xlsx file contains the data about the experiment types of the samples, including the output folders, rack position and instrument.

This information is necessary for mapping the experimental output to the sample data.

The file MUST contain the following columns: Sample ID, Experiment Number, Experiment Folder, Rack, Rack Position, Instrument, Acquisition Batch

Sample ID    Experiment Number    Experiment Folder         Rack    Rack position    Instrument    Acquisition batch
Sample_1     10                   Serum_Rack1_SLT_161213    1       A1               NMR01         1
Sample_1     11                   Serum_Rack1_SLT_161213    1       A1               NMR01         1
Sample_1     12                   Serum_Rack1_SLT_161213    1       A1               NMR01         1
Sample_2     20                   Serum_Rack1_SLT_161213    1       A2               NMR01         1
Sample_2     21                   Serum_Rack1_SLT_161213    1       A2               NMR01         1
Sample_2     22                   Serum_Rack1_SLT_161213    1       A2               NMR01         1
Sample_3     10                   Serum_Rack2_SLT_181213    2       A1               NMR01         1
Sample_3     11                   Serum_Rack2_SLT_181213    2       A1               NMR01         1
Sample_3     12                   Serum_Rack2_SLT_181213    2       A1               NMR01         1
Sample_4     21                   Serum_Rack2_SLT_181213    2       A2               NMR01         1
Sample_4     22                   Serum_Rack2_SLT_181213    2       A2               NMR01         1
Sample_4     23                   Serum_Rack2_SLT_181213    2       A2               NMR01         1

Once imported and assigned to the variable spectra (see Output), the NMR experiment info is saved inside the nmr_experiment_info object.

>> spectra.nmr_experiment_info

ans =

  csm_nmr_experiment_info with properties:

       entries: [12x1 containers.Map]
    sample_ids: {'Sample_1'  'Sample_2'  'Sample_3'  'Sample_4'}
      filename: '/Users/ghaggart/workspace/matlab/csm-matlab-toolbox/toolbox/csm_tools/misc/example_data/raw_bruker/testNmr...'


% View the entry keys - they are unique
>> unique_ids = keys(spectra.nmr_experiment_info.entries)

unique_ids =

  Columns 1 through 5

    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'

  Columns 6 through 10

    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'

  Columns 11 through 12

    'Serum_Ra...'    'Serum_Ra...'


% Get the first unique ID
>> unique_ids{1}

ans =

Serum_Rack1_SLT_161213-10


% Get a specific nmr_experiment_info_entry
>> nmr_experiment_info_entry = spectra.nmr_experiment_info.entries('Serum_Rack1_SLT_161213-10')

nmr_experiment_info_entry =

  csm_nmr_experiment_info_entry with properties:

                 unique_id: 'Serum_Rack1_SLT_161213-10'
                 sample_id: 'Sample_1'
         experiment_number: '10'
         experiment_folder: 'Serum_Rack1_SLT_161213'
                      rack: '1'
             rack_position: 'A1'
         acquisition_batch: '1'
                instrument: 'NMR01'
    spectrometer_frequency: []
                peak_width: []

    

NMR Experiment Info can also be accessed as a table by using:

>> spectra.nmr_experiment_info.getTable()

ans =

             Unique_ID             Sample_ID     Experiment_Number       Experiment_Folder        Rack    Rack_Position    Instrument    Acquisition_Batch
    ___________________________    __________    _________________    ________________________    ____    _____________    __________    _________________

    'Serum_Rack1_SLT_161213-10'    'Sample_1'    '10'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-11'    'Sample_1'    '11'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-12'    'Sample_1'    '12'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-20'    'Sample_2'    '20'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-21'    'Sample_2'    '21'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-22'    'Sample_2'    '22'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-10'    'Sample_3'    '10'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-11'    'Sample_3'    '11'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-12'    'Sample_3'    '12'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-20'    'Sample_4'    '20'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-21'    'Sample_4'    '21'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-22'    'Sample_4'    '22'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
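
In the output above, each unique ID looks like the experiment folder and the experiment number joined by a hyphen. That pattern is inferred from the example output, not from the toolbox source, so treat it as an assumption. Under that assumption, a key for a given folder and experiment number could be built and looked up as sketched below; the entries map is a stand-in so that the code runs without the toolbox.

```matlab
% Assumption: unique_id = '<experiment_folder>-<experiment_number>',
% inferred from the example output rather than from the toolbox source.
experiment_folder = 'Serum_Rack1_SLT_161213';
experiment_number = 10;
unique_id = sprintf('%s-%d', experiment_folder, experiment_number);

% Stand-in for spectra.nmr_experiment_info.entries; the real map holds
% csm_nmr_experiment_info_entry objects rather than structs.
entries = containers.Map();
entries('Serum_Rack1_SLT_161213-10') = struct('sample_id', 'Sample_1');

% Look the key up only if it is present, to avoid a missing-key error.
sample_id = '';
if isKey(entries, unique_id)
    entry = entries(unique_id);
    sample_id = entry.sample_id;
end
```

Checking isKey first avoids the error that containers.Map raises when a key is not present.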

    

nmr_calibration_info_path

The NMR_calibration_info.xlsx file contains the data about the calibration specification of the samples.

The file MUST contain the following columns: Sample ID, Sample Type, Calibration Type, Calibration Ref Point, Calibration Search Min, Calibration Search Max

Sample ID    Sample Type    Calibration Type    Calibration Ref Point    Calibration Search Min    Calibration Search Max
Sample_1     Serum          glucose
Sample_2     Serum          TSP
Sample_3     Serum          single              0                        3.03                      3.07
Sample_4     Serum          glucose

Once imported and assigned to the variable spectra (see Output), the NMR calibration info is saved inside the nmr_calibration_info object.

>> spectra.nmr_calibration_info

ans =

  csm_nmr_calibration_info with properties:

       entries: [4x1 containers.Map]
    sample_ids: {'Sample_1'  'Sample_2'  'Sample_3'  'Sample_4'}
      filename: '/Users/ghaggart/workspace/matlab/IMPaCTS/csm_tools/misc/example_data/raw_bruker/NMR_calibration_info.xlsx'


% View the entry keys - they are unique
>> sample_ids = keys(spectra.nmr_calibration_info.entries)

sample_ids =

    'Sample_2'    'Sample_1'    'Sample_4'    'Sample_3'


% Get a specific nmr_calibration_info_entry
>> nmr_calibration_info_entry = spectra.nmr_calibration_info.entries('Sample_2')

nmr_calibration_info_entry =


  csm_nmr_calibration_info_entry with properties:

                 sample_id: 'Sample_2'
               sample_type: 'Serum'
          calibration_type: 'TSP'
     calibration_ref_point: NaN
    calibration_search_min: NaN
    calibration_search_max: NaN

    

NMR Calibration Info can also be accessed as a table by using:

>> spectra.nmr_calibration_info.getTable()

ans =

    Sample_ID     Sample_Type    Calibration_Type    Calibration_Ref_Point    Calibration_Search_Min    Calibration_Search_Max
    __________    ___________    ________________    _____________________    ______________________    ______________________

    'Sample_2'    'Serum'        'TSP'               NaN                       NaN                       NaN
    'Sample_1'    'Serum'        'glucose'           NaN                       NaN                       NaN
    'Sample_4'    'Serum'        'glucose'           NaN                       NaN                       NaN
    'Sample_3'    'Serum'        'single'              0                      3.03                      3.07

experiment_path

The experiment_path is the folder that contains the raw Bruker output of the experiments.

The experiment folders and experiment numbers listed in the NMR experiment info file must match the contents of this folder.

Output

Spectra are imported and grouped by pulse program (experiment type) and sample type.

  • imported.oneDWS : 1D Water suppressed (noesy)
  • imported.cpmg : CPMG
  • imported.diff_edited : Diffusion edited
  • imported.jres : J-RES

Each of these variables is a map container (containers.Map), where each key is the sample type (e.g. serum or plasma) as specified in the NMR calibration info file.

% Return a cell array of the sample types

>> serum_oneDWS_sample_types = keys(imported.oneDWS)

serum_oneDWS_sample_types =

    'Serum'

% See all the sample types:

>> imported.sample_types

ans =

    'Serum'


% Access and assign the spectra stored in the map

>> spectra = imported.oneDWS('Serum')

spectra =

  csm_nmr_spectra with properties:

     nmr_experiment_info: [1x1 csm_nmr_experiment_info]
    nmr_calibration_info: [1x1 csm_nmr_calibration_info]
           pulse_program: 'oneDWS'
             inputparser: []
         csm_data_hashes: [0x1 containers.Map]
                       X: [4x20010 double]
                 x_scale: [1x20010 double]
            x_scale_name: 'ppm'
           is_continuous: []
              sample_ids: {'Sample_1'  'Sample_2'  'Sample_3'  'Sample_4'}
         sample_metadata: [1x1 csm_sample_metadata]
              audit_info: [1x1 csm_audit_info]
                use_hash: []
                    name: []
             sample_type: 'Serum'
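
To visit every imported spectra object, the pulse programs and sample types can be looped over. The sketch below is an illustration only: imported is replaced by a stand-in struct holding two maps with placeholder strings, and the check for absent pulse programs is an assumption, since it is not stated here whether all four are always present.

```matlab
% Stand-in for the output of csm_import_raw_nmr, so the sketch runs without
% the toolbox; real map values are csm_nmr_spectra objects, not strings.
imported = struct();
imported.oneDWS = containers.Map({'Serum'}, {'<oneDWS Serum spectra>'});
imported.cpmg   = containers.Map({'Serum'}, {'<cpmg Serum spectra>'});

pulse_programs = {'oneDWS', 'cpmg', 'diff_edited', 'jres'};
found = {};
for i = 1:numel(pulse_programs)
    pp = pulse_programs{i};
    % Skip pulse programs that are absent (works for structs and objects).
    if ~(isfield(imported, pp) || isprop(imported, pp))
        continue
    end
    spectra_map  = imported.(pp);
    sample_types = keys(spectra_map);
    for j = 1:numel(sample_types)
        spectra = spectra_map(sample_types{j});
        found{end+1} = sprintf('%s/%s: %s', pp, sample_types{j}, class(spectra)); %#ok<AGROW>
    end
end
disp(found)
```

With real data, the body of the inner loop is where per-spectra processing would go.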
    
You can also view the import errors. imported.import_errors is keyed by Bruker pulse program name and returns the import exceptions for that pulse program.

>> keys(imported.import_errors)

ans =

    'cpmgpr1d'    'jresgpprqf'    'noesygppr1d'


>> imported.import_errors('cpmgpr1d')

ans =

     {}
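
A quick summary of all import errors can be printed with a loop over the keys. This sketch assumes each value is a cell array of MException objects, which matches the empty {} shown above but is otherwise an assumption. The map is a stand-in and the failing entry is made up for illustration.

```matlab
% Stand-in for imported.import_errors: pulse program name -> cell array of
% exceptions (empty when nothing went wrong). The failing entry is made up.
demo_error = MException('demo:missingData', 'Hypothetical: raw data not found');
import_errors = containers.Map( ...
    {'cpmgpr1d', 'jresgpprqf', 'noesygppr1d'}, ...
    {{}, {}, {demo_error}});

names = keys(import_errors);
n_failed = 0;
for i = 1:numel(names)
    errs = import_errors(names{i});
    fprintf('%s: %d import error(s)\n', names{i}, numel(errs));
    for j = 1:numel(errs)
        fprintf('    %s\n', errs{j}.message);
    end
    n_failed = n_failed + ~isempty(errs);
end
```

Here n_failed counts the pulse programs that have at least one recorded exception.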

imported.save_dir/imported_data.mat contains the saved object.

QC

During import, basic QC is run checking:

  1. Which Bruker Data is missing.
  2. Which Spectral Metadata is missing.
  3. Which NMR experiment info is missing.
  4. Which NMR calibration info is missing.

The results are saved in imported.save_dir/importNMRlog.txt

More advanced analytical QC can be run by using csm_publish_qc_basic.

csm_publish_qc_basic( imported.oneDWS('Serum'), imported.peak_width_output_oneDWS('Serum'), imported.save_dir);

Copyright Imperial College London 2019