baikit.WrangleData

class baikit.WrangleData(manifest_fn, manifest_dir='input/manifest/')

Bases: Data

Wrangling data

Wrangles data for later use.

__init__(manifest_fn, manifest_dir='input/manifest/')
Parameters:
  • manifest_fn (str) – Manifest filename.

  • manifest_dir (str, optional) – Manifest directory. Default: “input/manifest/”

Methods

__init__(manifest_fn[, manifest_dir])

Initialize with a manifest filename and an optional manifest directory.

find_peak(data, peakregion_boundaries)

Find single peak

find_peaks(data, peakregion_boundaries)

Wrapper of find_peak()

load_data(manifest_line_index)

Convert manifest line to data

load_manifest()

Load manifest

raman_calib(data)

Calibrate Raman spectrum

save_data(data, line_fn, line_tag)

Convert data to manifest line

shift_col0(data, peakregion_par[, ...])

Shift column 0

stretch_col1(data, peaksregion_boundaries, ...)

Stretch column 1

unique_col0(data)

Find the unique values in 1st column

load_data(manifest_line_index) → tuple[numpy.ndarray, str, str]

Convert manifest line to data

Loads data from manifest line.

Parameters:

manifest_line_index (int) – Line number of the manifest line.

Returns:

Data, line filename, and line tag.

Return type:

tuple[numpy.ndarray, str, str]

save_data(data, line_fn, line_tag)

Convert data to manifest line

Saves data as specified in the manifest line, and prints the manifest line.

Parameters:
  • data (numpy.ndarray) – Data to save.

  • line_fn (str) – Line filename.

  • line_tag (str) – Line tag.

unique_col0(data) → ndarray

Find the unique values in 1st column

Returns data where the values in 1st column are unique (duplicates removed) and sorted.

Parameters:

data (numpy.ndarray) – Input ndarray.

Returns:

Output ndarray.

Return type:

numpy.ndarray
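
The deduplication described above can be sketched with plain NumPy. This is an illustration, not baikit's implementation: the function name unique_col0_sketch is hypothetical, and keeping the first row seen for each duplicated column-0 value is an assumption the documentation does not confirm.

```python
import numpy as np

def unique_col0_sketch(data: np.ndarray) -> np.ndarray:
    """Keep one row per distinct column-0 value, sorted by column 0."""
    # np.unique returns the sorted unique values; with return_index=True it
    # also returns the index of the first occurrence of each value.
    _, first_idx = np.unique(data[:, 0], return_index=True)
    # Assumption: for duplicated column-0 values, the first row wins.
    return data[first_idx]

spectrum = np.array([
    [520.0, 10.0],
    [300.0, 4.0],
    [520.0, 12.0],  # duplicate column-0 value
    [100.0, 1.0],
])
cleaned = unique_col0_sketch(spectrum)
```

Here cleaned has three rows ordered by column 0, and the second 520.0 row is dropped.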

find_peak(data, peakregion_boundaries) → tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

Find single peak

Finds a single peak within peak region boundaries.

Parameters:
  • data (numpy.ndarray) – Data.

  • peakregion_boundaries (numpy.ndarray) – ndarray holding the left and right boundaries of the peak region.

Returns:

Data within peak region, peak value, and data generated from fitted model.

Return type:

tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]
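
The three return values can be illustrated with a small stand-in. The documentation does not say which model baikit fits, so the sketch below assumes a parabola fitted with numpy.polyfit; the name find_peak_sketch and the layout of the returned peak as [position, height] are likewise assumptions.

```python
import numpy as np

def find_peak_sketch(data, peakregion_boundaries):
    """Return (region data, [peak position, peak height], fitted data)."""
    left, right = peakregion_boundaries
    mask = (data[:, 0] >= left) & (data[:, 0] <= right)
    region = data[mask]

    # Assumption: a parabola stands in for the unspecified fitted model.
    # Centering column 0 keeps the least-squares fit well conditioned.
    x0 = region[:, 0].mean()
    a, b, c = np.polyfit(region[:, 0] - x0, region[:, 1], deg=2)

    # Vertex of a*u**2 + b*u + c, expressed in the original coordinates.
    x_peak = x0 - b / (2 * a)
    y_peak = c - b ** 2 / (4 * a)

    fitted_y = np.polyval([a, b, c], region[:, 0] - x0)
    fitted = np.column_stack([region[:, 0], fitted_y])
    return region, np.array([x_peak, y_peak]), fitted

# Synthetic data: an exact parabola peaking at 521 with height 100.
x = np.linspace(500.0, 540.0, 41)
y = 100.0 - 0.5 * (x - 521.0) ** 2
data = np.column_stack([x, y])
region, peak, fitted = find_peak_sketch(data, np.array([510.0, 530.0]))
```

A real spectrum would likely call for a peak-shaped model (for example a Gaussian or Lorentzian); the parabola is only used to keep the sketch dependency-free.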

find_peaks(data, peakregion_boundaries) → ndarray

Wrapper of find_peak()

Finds multiple peaks within a series of peak region boundaries.

Parameters:
  • data (numpy.ndarray) – Data.

  • peakregion_boundaries (numpy.ndarray) – A 2-D ndarray of peak region boundaries.

Returns:

Peak values.

Return type:

numpy.ndarray

shift_col0(data, peakregion_par, col0_precision=4) → tuple[numpy.ndarray, numpy.ndarray, float]

Shift column 0

Shifts column 0 of data according to peakregion_par.

Parameters:
  • data (numpy.ndarray) – Input data.

  • peakregion_par (numpy.ndarray) – An ndarray of two elements: the peak center, then the peak half width.

  • col0_precision (int, optional) – Column 0 data output precision. Default: 4

Returns:

Output data, peak value, and diff.

Return type:

tuple[numpy.ndarray, numpy.ndarray, float]
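
One plausible reading of this method is sketched below. It assumes the observed peak is the row with the largest column-1 value inside [center - half width, center + half width], and that diff is the target center minus the observed peak position; neither detail is confirmed by the documentation, and shift_col0_sketch is a hypothetical name.

```python
import numpy as np

def shift_col0_sketch(data, peakregion_par, col0_precision=4):
    """Shift column 0 so the peak near `center` lands exactly on `center`."""
    center, half_width = peakregion_par
    mask = np.abs(data[:, 0] - center) <= half_width
    region = data[mask]

    # Assumption: take the highest point in the region as the peak
    # (the real method may use a fitted model instead).
    peak = region[np.argmax(region[:, 1])]

    # Assumption: diff is the offset added to column 0.
    diff = float(center - peak[0])

    shifted = data.copy()  # leave the input untouched
    shifted[:, 0] = np.round(shifted[:, 0] + diff, col0_precision)
    return shifted, peak, diff

# Synthetic peak measured at 518 that should sit at 520.
x = np.arange(510.0, 531.0)
y = np.exp(-0.5 * ((x - 518.0) / 2.0) ** 2)
data = np.column_stack([x, y])
shifted, peak, diff = shift_col0_sketch(data, np.array([520.0, 8.0]))
```

After the call, the maximum of column 1 in shifted sits at column-0 value 520.0, and data itself is unchanged.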

stretch_col1(data, peaksregion_boundaries, height, col1_precision=4) → tuple[numpy.ndarray, float]

Stretch column 1

Stretches column 1 of data to height.

Parameters:
  • data (numpy.ndarray) – Input data.

  • peaksregion_boundaries (numpy.ndarray) – An ndarray of two elements: the left boundary, then the right boundary.

  • height (float) – The average height within boundaries that the data are stretched to.

  • col1_precision (int, optional) – Column 1 data output precision. Default: 4

Returns:

Output data and coeff.

Return type:

tuple[numpy.ndarray, float]
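
The scaling can be sketched as follows, assuming coeff is the ratio of height to the mean of column 1 inside the boundaries and that the whole of column 1 is multiplied by it. The name stretch_col1_sketch is hypothetical, and baikit's actual computation may differ.

```python
import numpy as np

def stretch_col1_sketch(data, peaksregion_boundaries, height, col1_precision=4):
    """Scale column 1 so its mean inside the boundaries equals `height`."""
    left, right = peaksregion_boundaries
    mask = (data[:, 0] >= left) & (data[:, 0] <= right)

    # Assumption: coeff is a single multiplicative factor for column 1.
    coeff = float(height / data[mask, 1].mean())

    stretched = data.copy()  # leave the input untouched
    stretched[:, 1] = np.round(stretched[:, 1] * coeff, col1_precision)
    return stretched, coeff

data = np.array([
    [900.0, 1.0],
    [950.0, 2.0],
    [1000.0, 4.0],
    [1050.0, 2.0],
])
# Rows at 950 and 1000 fall inside the boundaries; their mean is 3.0.
stretched, coeff = stretch_col1_sketch(data, np.array([940.0, 1010.0]), height=6.0)
```

With these inputs the mean inside the boundaries goes from 3.0 to 6.0, so every column-1 value is doubled.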

raman_calib(data) → ndarray

Calibrate Raman spectrum

Calibrates the Raman shift against the position of the first-order Si peak (520 cm^-1), and the intensity against the average height of the second-order Si peak.

Parameters:

data (numpy.ndarray) – Raman spectrum to calibrate.

Returns:

Calibrated Raman spectrum.

Return type:

numpy.ndarray