Ignacio Elola is a self-proclaimed ‘data nerd, data punk and data scientist’ at import.io, a young startup that’s shaking up the world of data. With their free app, you can transform any website into a table of data or an API in minutes. Recently voted Best Startup by O’Reilly Strata Santa Clara, GigaOM and Web Summit, import.io has been backed by top European VCs and Silicon Valley-based angel investors.

Follow Peadar’s series of interviews with data scientists here.


Which project that you have worked on do you wish you could go back to and do better?

All of them. I’m constantly learning and improving, and if I could go back I could do all my past projects much better. That doesn’t mean I want to redo them all: when something is working, it is working and it is done. But for important projects it is good practice, in my opinion, to keep iterating and refactoring code, as every month I learn something new that could help me do this better.


What advice do you have for younger analytics professionals, and in particular PhD students in the sciences?

Two words: do it. The only way to really learn something is by doing, so be proactive and start getting things done and learning in the process. I would also advise against specializing too much in something unless you are very clear about what you want; a generalist can always specialize in something later on, but the other way around is harder. Plus, early in any career it is very beneficial to learn as much as possible from related disciplines and from the business side, not only the algorithm or statistics you are working on. Know your environment and learn from everybody around you.

What do you wish you knew earlier about being a data scientist?

I really haven’t found any bad surprises on my journey, things that I wish I had known earlier. I think keeping an open mind about your role, your company and everything else helps a lot with this.

How do you respond when you hear the phrase ‘big data’?

Well, I think “big data” really changed the data and technology space in terms of the tools (databases, search indexes, and so on) we need to deal with these amounts of data. But the real revolution it started is a revolution in mentality: the “all data is useful” thinking, the data-driven approach to decision making… It is all related, and we can see how it is already having a real impact in startups, medium-sized companies and big enterprises. That approach can be used with “big” or “small” data; it doesn’t matter. Most of the time people actually work with small or medium data, and not many companies are actually doing “big data”. But that is okay!

What is the most exciting thing about your field?

The thing I find most exciting is being able to work with different teams and departments and help everyone in their decision process by using data. I just love to improve processes and open everybody’s mind to the data-driven world!
Having the freedom to come up with new ideas and projects that create value out of the data you have in unexpected ways is also very challenging but rewarding, and I think it is a must-have in any data science role.

How do you go about framing a data problem? In particular, how do you avoid spending too long, how do you manage expectations, and how do you know what is good enough?

The starting point needs to be the business: what question are you trying to answer? I’m very pragmatic in framing data problems, and very output-oriented. The first thing is to formulate a question that makes sense and that will help you in some way, and to understand the business problem you are trying to solve or improve. Otherwise you won’t be able to know how good your answer is later on!
Then it is the turn of the data itself: what data do you have, how can you use it to answer that question, and how close can you get to answering it? Which algorithm to use and how to clean the data are technically difficult matters, but ones where you’ll find many resources to help you along the way: courses, books, tutorials, blogs… That’s why I find those first steps the most important ones.

 


Peadar Coyle is a Data Analytics Professional based in Luxembourg. He has helped companies solve problems using data relating to Business Process Optimization, Supply Chain Management, Air Traffic Data Analysis, Data Product Architecture and in Commercial Sales teams. He is always excited to evangelize about ‘Big Data’ and the ‘Data Mentality’, which comes from his experience as a Mathematics teacher and his Masters studies in Mathematics and Statistics. His recent speaking engagements include PyCon Sei in Florence and he will soon be speaking at PyData in Berlin and London. His expertise includes Bayesian Statistics, Optimization, Statistical Modelling and Data Products.
