csm_import_raw_nmr is a new import system that imports, calibrates and interpolates spectra. It also imports sample metadata and runs basic analytical QC checks on the data.
Usage is either with dialog boxes:
>> imported = csm_import_raw_nmr()
Or by specifying the main file paths in a struct:
>> ex = struct;
>> ex.sample_metadata_path = strcat( csm_settings.getValue('toolbox_path') , filesep, 'csm_tools', filesep, 'misc', filesep, 'example_data', filesep, 'raw_bruker', filesep, 'sample_metadata.xlsx');
>> ex.nmr_experiment_info_path = strcat( csm_settings.getValue('toolbox_path') , filesep, 'csm_tools', filesep, 'misc', filesep, 'example_data', filesep, 'raw_bruker', filesep, 'NMR_experiment_info.xlsx');
>> ex.nmr_calibration_info_path = strcat( csm_settings.getValue('toolbox_path') , filesep, 'csm_tools', filesep, 'misc', filesep, 'example_data', filesep, 'raw_bruker', filesep, 'NMR_calibration_info.xlsx');
>> ex.experiment_path = strcat( csm_settings.getValue('toolbox_path') , filesep, 'csm_tools', filesep, 'misc', filesep, 'example_data', filesep, 'raw_bruker' );
>> imported = csm_import_raw_nmr('experiment_info', ex);
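The repeated strcat calls above can be shortened by building the example folder once with fullfile. The following is a minimal sketch, not the toolbox's own code: toolbox_path is a placeholder standing in for the value returned by csm_settings.getValue('toolbox_path'), so that the snippet runs without the toolbox installed.

```matlab
% Placeholder; in a real session use csm_settings.getValue('toolbox_path').
toolbox_path = fullfile(tempdir, 'toolbox');

% Folder that holds the example raw Bruker data and the three info files.
example_dir = fullfile(toolbox_path, 'csm_tools', 'misc', 'example_data', 'raw_bruker');

ex = struct;
ex.sample_metadata_path      = fullfile(example_dir, 'sample_metadata.xlsx');
ex.nmr_experiment_info_path  = fullfile(example_dir, 'NMR_experiment_info.xlsx');
ex.nmr_calibration_info_path = fullfile(example_dir, 'NMR_calibration_info.xlsx');
ex.experiment_path           = example_dir;

% With the toolbox on the path, the import would then be called as:
% imported = csm_import_raw_nmr('experiment_info', ex);
```

fullfile inserts the platform's file separator itself, so the filesep arguments are no longer needed.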
When importing, the system imports the sample metadata, runs the preprocessing steps csm_calibrate_nmr and csm_interpolate_nmr, runs QC and handles any errors.
How the system preprocesses the data is specified in the required files.
Mixed sample types are allowed: if you have racks with mixed sample types, just specify them in the calibration file. They will be imported into separate csm_spectra objects, one per sample type.
The experiment_path is the full path to the parent folder which contains the raw Bruker data.
The sample_metadata_path is the full path to the file which contains the metadata for the sample.
The nmr_experiment_info_path is the full path to the file which contains the mappings to the raw Bruker data.
The nmr_calibration_info_path is the full path to the file which contains the data necessary for spectral calibration.
For examples of these files, please look in csm_tools/misc/example_data/raw_bruker.
Either enter these into the experiment_info struct when calling the import system, or select them during the dialog prompts.
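Before calling the import system with a struct, it can help to confirm that all four paths are set and exist. The sketch below is a hypothetical pre-flight check written with core MATLAB functions only; it is not part of the toolbox, and the example struct deliberately leaves one field out.

```matlab
% The four fields described above.
required_fields = {'experiment_path', 'sample_metadata_path', ...
                   'nmr_experiment_info_path', 'nmr_calibration_info_path'};

% Example struct with nmr_calibration_info_path deliberately left out.
ex = struct;
ex.experiment_path          = pwd;
ex.sample_metadata_path     = fullfile(pwd, 'sample_metadata.xlsx');
ex.nmr_experiment_info_path = fullfile(pwd, 'NMR_experiment_info.xlsx');

% Fields that have not been set at all.
is_set = isfield(ex, required_fields);
missing_fields = required_fields(~is_set);

% Fields that are set but point to something that does not exist on disk.
present_fields = required_fields(is_set);
not_found = {};
for i = 1:numel(present_fields)
    if exist(ex.(present_fields{i}), 'file') == 0
        not_found{end+1} = present_fields{i}; %#ok<AGROW>
    end
end
```

Any name left in missing_fields or not_found points at a path to fix before running the import.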
The sample_metadata.xlsx file contains sample metadata about the experiment. These fields will be imported into the csm_nmr_spectra objects.
Sample ID | Further Sample info | Case-control pair | Sampling Date | Sampling Protocol | Ship order | Age | Gender |
Sample_1 | Case | 1518 | 2010 | 3 | 1269 | 63 | M |
Sample_2 | Control | 1518 | 2010 | 3 | 1270 | 63 | M |
Sample_3 | Case | 2168 | 2010 | 3 | 1280 | 63 | M |
Sample_4 | Control | 2168 | 2010 | 3 | 1290 | 63 | M |
Once imported and assigned to the variable spectra (see Output), the metadata is saved inside the sample_metadata object.
>> spectra.sample_metadata
ans =
  csm_sample_metadata with properties:
                entries: [4x1 containers.Map]
    dynamic_field_names: {}
               filename: '/Users/ghaggart/workspace/matlab/IMPaCTS/csm_tools...'

% View the entry keys
>> keys(spectra.sample_metadata.entries)
ans =
    'Sample_2'    'Sample_1'    'Sample_4'    'Sample_3'

% Get a specific sample_metadata_entry
>> sample_metadata_entry = spectra.sample_metadata.entries('Sample_2')
sample_metadata_entry =
  csm_sample_metadata_entry with properties:
         sample_id: 'Sample_2'
    dynamic_fields: [7x1 containers.Map]

% Get the keys for the dynamic fields (column headers)
>> keys(sample_metadata_entry.dynamic_fields)
ans =
  Columns 1 through 5
    'Age'    'Case-control pair'    'Further Sample info'    'Gender'    'Sampling Date'
  Columns 6 through 7
    'Sampling Protocol'    'Ship order'

% Get the value for column
>> sample_metadata_entry.dynamic_fields('Age')
ans =
    63
Sample metadata can also be accessed as a table by using:
>> sample_table = spectra.sample_metadata.getTable()
sample_table =
    Sample_ID     Case_control_pair    Further_Sample_info    Gender    Sampling_Date    Sampling_Protocol    Ship_order
    __________    _________________    ___________________    ______    _____________    _________________    __________
    'Sample_2'    1518                 'Control'              'M'       2010             3                    1270
    'Sample_1'    1518                 'Case'                 'M'       2010             3                    1269
    'Sample_4'    2168                 'Control'              'M'       2010             3                    1290
    'Sample_3'    2168                 'Case'                 'M'       2010             3                    1280
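The entries map shown above can be looped over to select samples by a metadata value. The sketch below uses a stand-in for spectra.sample_metadata.entries (plain structs holding a dynamic_fields map, mirroring the properties shown in the output) so that it runs without the toolbox; with real data, entries would be taken from the imported object instead.

```matlab
% Stand-in for spectra.sample_metadata.entries: a Map from sample ID to a
% struct that carries a dynamic_fields Map (assumed shape, for illustration).
entries = containers.Map('KeyType', 'char', 'ValueType', 'any');
sample_ids = {'Sample_1', 'Sample_2', 'Sample_3', 'Sample_4'};
groups     = {'Case', 'Control', 'Case', 'Control'};
for i = 1:numel(sample_ids)
    entry = struct;
    entry.sample_id = sample_ids{i};
    entry.dynamic_fields = containers.Map( ...
        {'Further Sample info', 'Age'}, {groups{i}, 63}, 'UniformValues', false);
    entries(sample_ids{i}) = entry;
end

% Collect the IDs of all samples whose 'Further Sample info' is 'Case'.
ids = keys(entries);
case_ids = {};
for i = 1:numel(ids)
    entry = entries(ids{i});
    if strcmp(entry.dynamic_fields('Further Sample info'), 'Case')
        case_ids{end+1} = ids{i}; %#ok<AGROW>
    end
end
```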
The NMR_experiment_info.xlsx file contains the data about the experiment types of the samples, including the output folders, rack position and instrument.
This information is necessary for mapping the experimental output to the sample data.
The file MUST contain the following: Sample ID, Experiment Number, Experiment Folder, Rack, Rack Position, Instrument, Acquisition Batch
Sample ID | Experiment Number | Experiment Folder | Rack | Rack position | Instrument | Acquisition batch |
Sample_1 | 10 | Serum_Rack1_SLT_161213 | 1 | A1 | NMR01 | 1 |
Sample_1 | 11 | Serum_Rack1_SLT_161213 | 1 | A1 | NMR01 | 1 |
Sample_1 | 12 | Serum_Rack1_SLT_161213 | 1 | A1 | NMR01 | 1 |
Sample_2 | 20 | Serum_Rack1_SLT_161213 | 1 | A2 | NMR01 | 1 |
Sample_2 | 21 | Serum_Rack1_SLT_161213 | 1 | A2 | NMR01 | 1 |
Sample_2 | 22 | Serum_Rack1_SLT_161213 | 1 | A2 | NMR01 | 1 |
Sample_3 | 10 | Serum_Rack2_SLT_181213 | 2 | A1 | NMR01 | 1 |
Sample_3 | 11 | Serum_Rack2_SLT_181213 | 2 | A1 | NMR01 | 1 |
Sample_3 | 12 | Serum_Rack2_SLT_181213 | 2 | A1 | NMR01 | 1 |
Sample_4 | 20 | Serum_Rack2_SLT_181213 | 2 | A2 | NMR01 | 1 |
Sample_4 | 21 | Serum_Rack2_SLT_181213 | 2 | A2 | NMR01 | 1 |
Sample_4 | 22 | Serum_Rack2_SLT_181213 | 2 | A2 | NMR01 | 1 |
Once imported and assigned to the variable spectra (see Output), the NMR experiment info is saved inside the nmr_experiment_info object.
>> spectra.nmr_experiment_info
ans =
  csm_nmr_experiment_info with properties:
       entries: [12x1 containers.Map]
    sample_ids: {'Sample_1' 'Sample_2' 'Sample_3' 'Sample_4'}
      filename: '/Users/ghaggart/workspace/matlab/csm-matlab-toolbox/toolbox/csm_tools/misc/example_data/raw_bruker/testNmr...'

% View the entry keys - they are unique
>> unique_ids = keys(spectra.nmr_experiment_info.entries)
unique_ids =
  Columns 1 through 5
    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'
  Columns 6 through 10
    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'    'Serum_Ra...'
  Columns 11 through 12
    'Serum_Ra...'    'Serum_Ra...'

% Get the first unique id
>> unique_ids{1}
ans =
Serum_Rack1_SLT_161213-10

% Get a specific nmr_experiment_info_entry
>> nmr_experiment_info_entry = spectra.nmr_experiment_info.entries('Serum_Rack1_SLT_161213-10')
nmr_experiment_info_entry =
  csm_nmr_experiment_info_entry with properties:
                 unique_id: 'Serum_Rack1_SLT_161213-10'
                 sample_id: 'Sample_1'
         experiment_number: '10'
         experiment_folder: 'Serum_Rack1_SLT_161213'
                      rack: '1'
             rack_position: 'A1'
         acquisition_batch: '1'
                instrument: 'NMR01'
    spectrometer_frequency: []
                peak_width: []
NMR Experiment Info can also be accessed as a table by using:
>> spectra.nmr_experiment_info.getTable()
ans =
    Unique_ID                      Sample_ID     Experiment_Number    Experiment_Folder           Rack    Rack_Position    Instrument    Acquisition_Batch
    ___________________________    __________    _________________    ________________________    ____    _____________    __________    _________________
    'Serum_Rack1_SLT_161213-10'    'Sample_1'    '10'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-11'    'Sample_1'    '11'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-12'    'Sample_1'    '12'                 'Serum_Rack1_SLT_161213'    '1'     'A1'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-20'    'Sample_2'    '20'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-21'    'Sample_2'    '21'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack1_SLT_161213-22'    'Sample_2'    '22'                 'Serum_Rack1_SLT_161213'    '1'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-10'    'Sample_3'    '10'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-11'    'Sample_3'    '11'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-12'    'Sample_3'    '12'                 'Serum_Rack2_SLT_181213'    '2'     'A1'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-20'    'Sample_4'    '20'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-21'    'Sample_4'    '21'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
    'Serum_Rack2_SLT_181213-22'    'Sample_4'    '22'                 'Serum_Rack2_SLT_181213'    '2'     'A2'             'NMR01'       '1'
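Judging from the output above, each unique ID appears to be the experiment folder and the experiment number joined by a hyphen. The sketch below rebuilds IDs in that form from plain cell arrays and checks that no folder/number pair is repeated. It is an illustration inferred from the printed IDs, not the toolbox's own implementation.

```matlab
% A few folder/number pairs taken from the example table.
experiment_folders = {'Serum_Rack1_SLT_161213', 'Serum_Rack1_SLT_161213', 'Serum_Rack2_SLT_181213'};
experiment_numbers = {'10', '11', '10'};

% Join folder and number with a hyphen, as in 'Serum_Rack1_SLT_161213-10'.
unique_ids = strcat(experiment_folders, '-', experiment_numbers);

% Every folder/number pair should occur only once.
has_duplicates = numel(unique(unique_ids)) < numel(unique_ids);
```

The same experiment number can therefore be reused across racks, as long as the experiment folder differs.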
The NMR_calibration_info.xlsx file contains the data about the calibration specification of the samples.
The file MUST contain the following: Sample ID, Sample Type, Calibration Type, Calibration Ref Point, Calibration Search Min, Calibration Search Max
Sample ID | Sample Type | Calibration Type | Calibration Ref Point | Calibration Search Min | Calibration Search Max |
Sample_1 | Serum | glucose | | | |
Sample_2 | Serum | TSP | | | |
Sample_3 | Serum | single | 0 | 3.03 | 3.07 |
Sample_4 | Serum | glucose | | | |
Once imported and assigned to the variable spectra (see Output), the NMR calibration info is saved inside the nmr_calibration_info object.
>> spectra.nmr_calibration_info
ans =
  csm_nmr_calibration_info with properties:
       entries: [4x1 containers.Map]
    sample_ids: {'Sample_1' 'Sample_2' 'Sample_3' 'Sample_4'}
      filename: '/Users/ghaggart/workspace/matlab/IMPaCTS/csm_tools/misc/example_data/raw_bruker/NMR_calibration_info.xlsx'

% View the entry keys - they are unique
>> sample_ids = keys(spectra.nmr_calibration_info.entries)
sample_ids =
    'Sample_2'    'Sample_1'    'Sample_4'    'Sample_3'

% Get a specific nmr_calibration_info_entry
>> nmr_calibration_info_entry = spectra.nmr_calibration_info.entries('Sample_2')
nmr_calibration_info_entry =
  csm_nmr_calibration_info_entry with properties:
                 sample_id: 'Sample_2'
               sample_type: 'Serum'
          calibration_type: 'TSP'
     calibration_ref_point: NaN
    calibration_search_min: NaN
    calibration_search_max: NaN
NMR Calibration Info can also be accessed as a table by using:
>> spectra.nmr_calibration_info.getTable()
ans =
    Sample_ID     Sample_Type    Calibration_Type    Calibration_Ref_Point    Calibration_Search_Min    Calibration_Search_Max
    __________    ___________    ________________    _____________________    ______________________    ______________________
    'Sample_2'    'Serum'        'TSP'               NaN                      NaN                       NaN
    'Sample_1'    'Serum'        'glucose'           NaN                      NaN                       NaN
    'Sample_4'    'Serum'        'glucose'           NaN                      NaN                       NaN
    'Sample_3'    'Serum'        'single'            0                        3.03                      3.07
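In the example rows, only the 'single' calibration type carries a reference point and a search window, while the 'glucose' and 'TSP' rows leave them empty (NaN after import). The sketch below checks a calibration table against that pattern. The rule itself is an assumption read off the example data, not something the toolbox is documented to enforce.

```matlab
% Rows from the example calibration table; NaN marks an empty cell.
sample_ids        = {'Sample_1', 'Sample_2', 'Sample_3', 'Sample_4'};
calibration_types = {'glucose', 'TSP', 'single', 'glucose'};
ref_points        = [NaN, NaN, 0,    NaN];
search_mins       = [NaN, NaN, 3.03, NaN];
search_maxs       = [NaN, NaN, 3.07, NaN];

% Assumed rule: 'single' rows need all three numeric values, and the
% search minimum must lie below the search maximum.
is_single  = strcmp(calibration_types, 'single');
has_values = ~isnan(ref_points) & ~isnan(search_mins) & ~isnan(search_maxs);
window_ok  = search_mins < search_maxs;   % false wherever a value is NaN

% IDs of 'single' rows that fail the assumed rule (empty for this example).
incomplete_ids = sample_ids(is_single & ~(has_values & window_ok));
```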
The experiment_path is the folder that contains the raw Bruker output of the experiments.
The entries in the NMR experiment info file must match the data in this folder.
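One way to check this match before importing is to build the folder each row of the experiment info file points to and test whether it exists. The sketch below assumes the conventional Bruker layout, in which each experiment number is a subfolder of its experiment folder. Whether the import system resolves paths in exactly this way is an assumption, and the experiment_path used here is a placeholder.

```matlab
% Placeholder location; use your own experiment_path here.
experiment_path = fullfile(tempdir, 'raw_bruker');

% Folder/number pairs as listed in the NMR experiment info file.
experiment_folders = {'Serum_Rack1_SLT_161213', 'Serum_Rack2_SLT_181213'};
experiment_numbers = {'10', '10'};

% Build <experiment_path>/<Experiment Folder>/<Experiment Number> for each
% row and record whether that folder is present on disk.
expected_dirs = cell(size(experiment_folders));
found = false(size(experiment_folders));
for i = 1:numel(experiment_folders)
    expected_dirs{i} = fullfile(experiment_path, experiment_folders{i}, experiment_numbers{i});
    found(i) = exist(expected_dirs{i}, 'dir') == 7;
end

% Rows of the info file with no matching folder.
missing_dirs = expected_dirs(~found);
```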
Spectra are imported and grouped by pulse program (experiment type) and sample type.
% Return a cell array of the sample types
>> serum_oneDWS_sample_types = keys(imported.oneDWS)
serum_oneDWS_sample_types =
    'Serum'

% See all the sample types:
>> imported.sample_types
ans =
    'Serum'

% Access and assign the spectra stored in the map
>> spectra = imported.oneDWS('Serum')
spectra =
  csm_nmr_spectra with properties:
     nmr_experiment_info: [1x1 csm_nmr_experiment_info]
    nmr_calibration_info: [1x1 csm_nmr_calibration_info]
           pulse_program: 'oneDWS'
             inputparser: []
         csm_data_hashes: [0x1 containers.Map]
                       X: [4x20010 double]
                 x_scale: [1x20010 double]
            x_scale_name: 'ppm'
           is_continuous: []
              sample_ids: {'Sample_1' 'Sample_2' 'Sample_3' 'Sample_4'}
         sample_metadata: [1x1 csm_sample_metadata]
              audit_info: [1x1 csm_audit_info]
                use_hash: []
                    name: []
             sample_type: 'Serum'
You can also view any errors raised during import. imported.import_errors returns the import exceptions:
>> keys(imported.import_errors)
ans =
    'cpmgpr1d'    'jresgpprqf'    'noesygppr1d'

>> imported.import_errors('cpmgpr1d')
ans =
     {}
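To see at a glance which entries had problems, the errors map can be looped over. The sketch below uses a stand-in for imported.import_errors so that it runs without the toolbox. It assumes each value is a cell array of MException objects, empty when nothing went wrong; the failure shown is hypothetical.

```matlab
% Stand-in for imported.import_errors (assumed shape, for illustration).
import_errors = containers.Map('KeyType', 'char', 'ValueType', 'any');
import_errors('cpmgpr1d')    = {};
import_errors('noesygppr1d') = {MException('example:importFailed', 'Hypothetical failure')};

% List every key that has at least one exception, printing each message.
error_keys = keys(import_errors);
failed = {};
for i = 1:numel(error_keys)
    errs = import_errors(error_keys{i});
    if ~isempty(errs)
        failed{end+1} = error_keys{i}; %#ok<AGROW>
        for j = 1:numel(errs)
            fprintf('%s: %s\n', error_keys{i}, errs{j}.message);
        end
    end
end
```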
The imported object is saved to imported_data.mat inside the folder given by imported.save_dir.
During import, basic QC is run checking:
More advanced analytical QC can be run by using csm_publish_qc_basic.
>> csm_publish_qc_basic( imported.oneDWS('Serum'), imported.peak_width_output_oneDWS('Serum'), imported.save_dir);
Copyright Imperial College London 2019