DATA: Programming support for working with large data files

Summary:

Increasingly, research systems used in Psychology generate large volumes of data (e.g. eye trackers, EEG). The Psychology Technical group have recently provided programming support to help with processing eye-tracking and Brain Vision data, and are always happy to help out where possible by writing bespoke programs.

Contact us on x7995 or at psychsupport@kent.ac.uk to arrange a time to discuss your data and whether we might be able to help.

See below for more details of a couple of recent projects.

Detailed:

Example 1 – Eye tracking pupil size data

The data consisted of 28 files, each with ~70,000 lines of pupil-size data looking like this:


1  13   Neutral   180   162.94    fixation   2.3834   2.396 0  0
1  13   Neutral   180   179.69    fixation   2.321    2.383 0  4
1  13   Neutral   180   196.32    fixation   2.4191   2.374 0  0
1  13   Neutral   180   212.94    stimulus   2.3266   2.401 0  4
1  13   Neutral   180   229.57    stimulus   2.3441   2.298 0  0

A program was written to process the files one at a time. For each file, baseline average values were calculated for the 500 ms period before the stimulus; the minimum post-stimulus value was then identified and averages for further 500 ms periods were calculated. The summary results were saved in separate results files. The program was written in the Python programming language. It took less than a minute to process all the data files, and it was written so that the analysis is straightforward to modify. This makes it ideal for situations where you are not sure exactly how you want to analyse your data and need to explore it first.
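As a rough illustration of this kind of analysis, here is a minimal Python sketch. It is not the program that was actually written. It assumes that the fifth column is a time in milliseconds, the sixth is the trial phase ("fixation" or "stimulus") and the seventh is the pupil size; the source data does not label its columns, so these positions are guesses. It also assumes the "further" average is taken over the 500 ms starting at the post-stimulus minimum. The names `parse_line` and `summarise_trial` are made up for this example.

```python
from statistics import mean

# Assumed 0-based column positions; the real files may be laid out differently.
TIME_COL, PHASE_COL, PUPIL_COL = 4, 5, 6
WINDOW_MS = 500.0


def parse_line(line):
    """Split one whitespace-separated line into (time_ms, phase, pupil)."""
    fields = line.split()
    return float(fields[TIME_COL]), fields[PHASE_COL], float(fields[PUPIL_COL])


def summarise_trial(lines):
    """Return the baseline mean, the post-stimulus minimum and the mean after it."""
    samples = [parse_line(line) for line in lines if line.strip()]
    # Stimulus onset is taken as the time of the first sample labelled "stimulus".
    onset = next(t for t, phase, _ in samples if phase == "stimulus")
    # Baseline: samples in the 500 ms immediately before onset.
    baseline = [p for t, _, p in samples if onset - WINDOW_MS <= t < onset]
    # Post-stimulus samples and the smallest pupil value among them.
    post = [(t, p) for t, _, p in samples if t >= onset]
    min_time, min_pupil = min(post, key=lambda tp: tp[1])
    # Assumed follow-up window: the 500 ms starting at the minimum.
    after_min = [p for t, p in post if min_time <= t < min_time + WINDOW_MS]
    return {
        "baseline_mean": mean(baseline),
        "post_min": min_pupil,
        "post_min_time": min_time,
        "after_min_mean": mean(after_min),
    }
```

Because the window length and column positions are plain constants at the top, changing the analysis (for example, a longer baseline) only needs a one-line edit, which is the kind of flexibility described above.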

 

Example 2 – Brain Vision – mass, automated editing of trigger files

When a recording is opened in BrainVision (BV), the trigger file (.vmrk) is read and integrated with the raw data file (.eeg). However, once this has been done, the trigger file is no longer used, and the only way to access it again is to delete the history files, including ICAs and artifact rejections. If triggers need updating without losing your history, a compatible file must be imported instead; the triggers themselves are exported and edited as .csv, a format entirely different from the .vmrk file.

A program was written that creates new versions of the VMRK (XML) files from a list of trigger names and times taken from a separate, updated CSV file. That CSV file is created as follows: the triggers are exported in .csv format and can then be modified in Excel, either by hand or by algorithm.

One use case is updating all triggers, for all participants, with their behavioural data so as to indicate whether each trial was answered correctly or incorrectly, or even to indicate a particular response time window. Using code, this modification can be performed in seconds instead of days.

Example csv file extract:

Blink,1,
S  9,2723,
S101,3481,
S  7,4748,
S 81,5515,

Example VMRK file extract:

<Marker>
<Type>Stimulus</Type>
<Description>S  8</Description>
<Position>33698</Position>
<Points>1</Points>
<Channel>All</Channel>
<Date>0001-01-01T00:00:00</Date>
</Marker>
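The conversion between the two extracts above could look something like the following Python sketch. It is only an illustration, not the program that was written. It assumes each CSV row holds a trigger description followed by its position, and that the `Points`, `Channel` and `Date` values are the same for every marker as in the extract. The rule used to pick the marker `Type` (an "S" followed by a number is a stimulus, anything else a comment) is hypothetical, as are the names `marker_type` and `csv_to_markers`.

```python
import csv
import io
import re
from xml.sax.saxutils import escape

# Fixed fields are copied from the VMRK extract; assumed constant for all markers.
MARKER_TEMPLATE = """<Marker>
<Type>{type}</Type>
<Description>{description}</Description>
<Position>{position}</Position>
<Points>1</Points>
<Channel>All</Channel>
<Date>0001-01-01T00:00:00</Date>
</Marker>"""


def marker_type(description):
    """Hypothetical rule: 'S' plus a number is a stimulus, anything else a comment."""
    return "Stimulus" if re.fullmatch(r"S\s*\d+", description) else "Comment"


def csv_to_markers(csv_text):
    """Turn 'description,position,' rows into a string of <Marker> blocks."""
    blocks = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        description, position = row[0], int(row[1])
        blocks.append(MARKER_TEMPLATE.format(
            type=marker_type(description),
            description=escape(description),  # spacing such as "S  9" is kept
            position=position,
        ))
    return "\n".join(blocks)


if __name__ == "__main__":
    print(csv_to_markers("Blink,1,\nS  9,2723,\n"))
```

A real VMRK file will also contain header and wrapper elements around the markers; these are not shown in the extract, so the sketch leaves them out.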

 

If you have any data analysis questions, contact us on x7995 or at psychsupport@kent.ac.uk to arrange a time to discuss your data and whether we might be able to help.
