Depends on the scope of the project. Would the goal be to come up with a better ...

BlueTemplar · on Jan 11, 2021

The data in the database comes from a two-dimensional matrix (LMNE) where leucocytes are classified by resistivity on one axis and light absorption (?) on the other. (I wonder how they managed the separation by absorption... indirectly via centrifugation?) So I guess it's not really histological?

Looks like it's a new model; I have no idea whether they have any ML models yet. There's also some database work.

I'm finishing a Master's degree in Computational Physics, so Linear Algebra and Probability shouldn't be an issue. (We also have an Image Processing and Analysis course.) I guess that's why they contacted us despite the fact that we don't have any ML training?

Yeah, this is basically what I was planning to do, but thank you for your advice!

n3ur0n · on Jan 11, 2021

Given your background, I think it would be worthwhile for you to pick up ESL (The Elements of Statistical Learning) [0] and read some relevant sections (supervised/sparse/linear methods). It's a great book and a good starting point for thinking about ML methods for high-dimensional data.

Also, it might be useful to look at the webpages of some researchers in this space and the courses they teach [1,2].

  [0] https://web.stanford.edu/~hastie/ElemStatLearn/  
  [1] https://scholars.duke.edu/person/dunson  
  [2] https://www.cs.princeton.edu/~bee/

BlueTemplar · on Jan 12, 2021

Thank you!

Funny (but I guess expected) to see the Markov Chain Monte Carlo method, which we learned very recently, in that book's table of contents! (Unless it's a different MCMC?)