On Thursday, October 3, 2013 9:01:07 AM UTC+13, Chris Bishop wrote:
> Hello,
>
> I am trying to put together a plan that allows me to take full advantage of the recent MATLAB toolbox and computer hardware developments ... I was hoping someone might be able to help me vet my approach.
>
> I will be analyzing neuroscientific data (brain waves, etc.) from multiple subjects using open-source software available through MATLAB (EEGLAB, ERPLAB). Analysis pipelines will rely in part on the Signal Processing Toolbox, the Statistics Toolbox, and a few other minor toolboxes. Each subject's data will undergo a nearly identical analysis from start to finish - that is, one subject's data is essentially a self-contained processing event. Data files will vary in size, but could easily exceed several GB per subject.
>
> Cluster-based computing is not a cost-effective solution for us right now, so a stand-alone workstation is our best option. I plan to couple a dual-CPU machine (4 cores/CPU, ~2.0 GHz clock speed) equipped with 32 GB RAM and ample storage (two 2 TB drives in a mirrored RAID for data redundancy) with the Parallel Computing Toolbox in MATLAB.
>
> My questions are as follows:
>
> 1. Will I be able to take advantage of all cores using the Parallel Computing Toolbox?
> 2. What are the likely bottlenecks (hardware or otherwise)?
> 3. Is this data analysis pipeline likely "parallelizable"?
> 4. Does anyone have an alternative solution or build suggestions?
>
> Thank you for your time.
> -Chris
In my experience using Fortran on a Cray, unless you have a really large problem, parallelisation is not worth the effort.
In your case, you want to run the same pipeline many times in parallel (one run per subject), rather than splitting the computation of each pipeline across many processors - an "embarrassingly parallel" workload. So why not simply open several MATLAB sessions and run one subject's pipeline in each? The operating system will sort out the allocation of cores and memory among them.
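In practice this can be scripted so the sessions launch unattended. A minimal sketch, assuming a hypothetical `run_pipeline` function on the MATLAB path and placeholder subject IDs:

```shell
#!/bin/sh
# Launch one headless MATLAB session per subject; the OS schedules the
# sessions across the available cores. run_pipeline and the subject IDs
# below are placeholders for your own pipeline and data.
for subj in S01 S02 S03 S04; do
  matlab -nodisplay -nosplash -r "run_pipeline('$subj'); exit" \
    > "log_$subj.txt" 2>&1 &
done
wait   # block until every subject's pipeline has finished
```

Each session holds its own copy of the subject's data in memory, so keep the number of concurrent sessions within what the 32 GB of RAM can accommodate.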