Dr. Kirk Borne is a data scientist and an astrophysicist. He has been Principal Data Scientist in the Strategic Innovation Group at Booz Allen Hamilton since 2015. He was Professor of Astrophysics and Computational Science in the George Mason University (GMU) School of Physics, Astronomy, and Computational Sciences from 2003 to 2015. He served as undergraduate advisor for the GMU Data Science program and graduate advisor to students in the Computational Science and Informatics PhD program. Prior to that, he spent nearly 20 years supporting NASA projects, including NASA’s Hubble Space Telescope as Data Archive Project Scientist, NASA’s Astronomy Data Center, and NASA’s Space Science Data Operations Office. He has extensive experience in large scientific databases and information systems, including expertise in scientific data mining. He contributed to the design and development of the new Large Synoptic Survey Telescope (LSST) in the areas of science data management, informatics and statistical science research, galaxies research, and education and public outreach.

We are proud to have Kirk presenting at Data Natives 2015!

How did you make the jump from astronomy to data science?

The jump was a gradual one (a lifelong evolution) for me. As an astronomer, I was always working with data from telescopes of all sizes. My experience with these various scientific instruments and observatories in different parts of the world (including telescopes in the USA, in Chile, and at the German-Spanish Calar Alto Observatory in Spain) led me to work as a research scientist for the Hubble Space Telescope in the 1980s. My service work at the Space Telescope Science Institute included database design, development, and report generation, which ultimately led to my appointment as NASA’s Hubble Data Archive Project Scientist. We created one of the world’s first major public (open) data repositories for scientific researchers: open access, user-friendly search, web-based interfaces, and more.

This work eventually opened up new opportunities for me at NASA, and I became a contract manager in 1995 for NASA’s Astrophysics Data Facility and Astronomy Data Center, within the USA’s National Space Science Data Center (NSSDC). As I worked with data more and more in my daily professional life, developing better tools for management, indexing, search, access, analysis, visualization, and discovery, it was inevitable that my research and professional interests would migrate toward data science.

Can you describe the journey that led to where you are today, and your major influences along the way?

Usama Fayyad, the first Chief Data Officer

When I was at the NSSDC in the 1990s, we catalogued, archived, curated, and provided access to over 15,000 space science datasets from many thousands of space instruments. It was during this period that I began to notice a dramatic increase in the size of the datasets that we were ingesting. The biggest jump occurred in 1998 when we were asked to ingest one single experiment’s data, whose total volume of 2 Terabytes was double the total data volume (1 Terabyte) of the other 15,000 datasets combined! It was then that I realized that things were dramatically changing in the expanding “data universe”. I began to explore the power of data mining and machine learning to make discoveries in massive data – I was fascinated with finding the “unknown unknowns” in large data collections. I was influenced greatly by several groups and individuals, including: (1) the work at IBM (on the Advanced Scout project that mined the play-by-play databases for professional basketball games); (2) the customer-based data mining and marketing efforts at Capital One Credit Card Company (which were described in a business magazine that I found in the office snack room one day); and (3) Usama Fayyad (who was working at NASA’s Jet Propulsion Lab with astronomers to mine large galaxy databases). That was the conspiracy of events and inspirational persons that influenced me the most at the beginning of my transition from traditional astronomical research (studying colliding and merging galaxies) to multidisciplinary data-driven research in numerous disciplines, organizations, and industries. Those influences led me to data science and into becoming the data scientist that I am today.

The full transformation for me occurred in 2003 when I left NASA (after 18 years) to become a faculty member at George Mason University in the Computational Science and Informatics (Data Science) PhD program. Ultimately, my colleagues and I launched the world’s first undergraduate Data Science degree program in 2007. The world was taking note of big data and data science, and we were right there in the middle of it. After teaching, advising, and doing research in data science at the academic level for 12 years, I was hungry to do more, in a broader context, for more organizations, in a variety of industries where data analytics is changing the world. When Booz Allen Hamilton offered me that opportunity as their Principal Data Scientist in the NextGen Analytics and Data Science group (consisting of more than 500 data scientists), I left the university and joined this remarkable team in May 2015.

What major milestones or landmark events have stood out to you during your time in the industry? What are you still waiting for?

The first milestone goes back over 30 years, when professional organizations and societies formed initiatives around data mining and knowledge discovery from data. The creation of the KDnuggets newsletter by Gregory Piatetsky-Shapiro in 1993 was a definite landmark event in the history of data science. The birth of Google is another big one – they set out not just to be a search engine company, but to organize all of the world’s information and to make it universally accessible and useful, which they accomplish with some amazing mathematics (linear algebra, the one university class that I took 40 years ago that is now at the top of my recommendations when new students ask me how to get into data science). Then there was a series of events that brought out the importance of data mining (and data science): the terror attacks in the USA on September 11, 2001; the Washington DC sniper case of 2002; and the now-famous Walmart strawberry pop-tarts data mining story of 2004. Those isolated incidents were not isolated in my mind or in the minds of data scientists – the power of data to know the world, to improve the world, and to change the world was becoming increasingly visible to everyone.

Internet of Things: One of the three major changes coming for data science.

Then, in 2011-2012, there were three landmark events that changed everything for us: (a) the publication of the McKinsey research report “Big data: The next frontier for innovation, competition, and productivity” in 2011, which emphasized the dramatic shortage in data professionals that the workforce would be facing within the next few years; (b) the announcement by the President of the USA of the National Big Data Initiative (including several hundred million dollars of research investments); and (c) the publication of an article in the prestigious Harvard Business Review with the title “Data Scientist: The Sexiest Job of the 21st Century.” From my perspective, those three events launched the Data Science revolution that we are now experiencing. The next big changes will include the internet of things (with ubiquitous sensors collecting massive streams of data on everything, all the time), faster analytics (in-memory, on-the-chip, cleverer algorithms, machine learning on clusters and in the cloud, quantum machine learning, and more), and greater conversion of all businesses into data businesses.

In the 30 years since you joined the Space Telescope Science Institute, what have been the most significant challenges you have faced?

One of the biggest challenges that everyone has faced in this field is cultural inertia, specifically resistance to change. Whether we look at academic institutions, government agencies, or commercial businesses, there have been a lot of folks who criticized, minimized, or otherwise ignored the revolution that was growing around them. Trying to get organizations, industries, or professions to change requires years of patience, persistence, and perspiration. Those of us who lived through those challenges are seeing the fruits of our labors now. In fact, it is not just fruits but entire forests of opportunities! I learned through the years that the best way to bring about big changes fast is to go right to the top – so I was lead author on two position papers in 2009 that were submitted to the USA’s National Research Council of the National Academies of Science. One paper was focused on the transformation of my field (Astronomy) into a data-oriented research discipline (Astroinformatics), and the second paper was focused on changing the education system (not just in astronomy, but in all aspects of school-based learning and lifelong learning) by incorporating “Data Science for the Masses” in all learning settings. Those papers got noticed by significant persons, and the transformations are now well underway.

The challenges that we previously faced have not completely evaporated, but there is hope that they are fading. We are now seeing that the resistance comes not so much from the leadership within organizations as from the mid-level workforce – they haven’t entirely embraced the changes that a data-driven business requires, but they are moving in the right direction. The leaders of organizations are now encouraging, sponsoring, and rewarding such transformations in people, processes, and products. That is exactly what my company Booz Allen Hamilton is doing, and it is a wonderful thing to be part of.

Of your achievements and accolades so far, which are you most proud of?

One of the many projects Kirk has contributed to: The Hubble Space Telescope

That is a tough question, but I presume that you are not referring to my wonderful family, children, and grandchildren. I am most proud of my humility. (Hint: that was a small joke.) Seriously, I am humbled by the opportunities, talents, and aptitudes that I have been given. So, when I say that “I am proud of…”, what I really mean to say is that “I am humbled by…”. So, here we go… I am proud that I survived a very tough undergraduate education in Physics. I am proud that I completed my doctoral degree in one of the world’s top astronomy programs (Caltech). I am proud of the awards that I won for my work on the Hubble Space Telescope project. I am proud of the innovations that my group at the NSSDC created around the use and exploration of large datasets. I am proud of co-creating the world’s first undergraduate Data Science degree program. I am proud of my PhD students who have produced some incredible doctoral dissertations. I am proud of the Faculty Impact Award that I received from the Dean of George Mason University’s College of Science. I am proud to be among the worldwide top influencers in big data and data science. I am proud to be a member of the awesome Booz Allen Hamilton data science team. And I am proud to be a part of the Data Natives community of data-driven world-changing innovators. (As you can see, I have a lot to be grateful for – hence the humility!)

What kind of problems are you aiming to tackle in your role at Booz Allen Hamilton?

I get great pleasure from finding solutions through data. The scientist in me loves to explore data (evidence), make new discoveries, develop hypotheses to explain them, and then test those hypotheses. At Booz Allen Hamilton I am privileged to exercise that Data Scientist role across numerous internal and external projects: human resources, organizational change, training and mentoring, marketing, customer engagement, behavior analytics, risk mitigation, novelty discovery (including fraud detection, anomaly detection, surprise discovery), thought leadership, socialization of data science across organizations and industries, data technologies (including machine learning, data management innovations, graph analytics), predictive and prescriptive analytics, geospatial-temporal modeling, and much more. I feel like a child in a candy store most of the time, and I have to exercise some good judgement as to where to get involved. It is tempting to get involved in too many things.

What advice would you give to the Kirk Borne of 30 years ago? Anything you would have done differently?

I wrote a blog post for MapR about my “growth hacker’s journey”. My message throughout that self-history was that I was often in the right place at the right time. I usually didn’t recognize that reality when I was experiencing each of those career moments. It took many years of hindsight to see the growth trajectory, how it was born, and how it took shape. So, I guess I wouldn’t have done things too differently (maybe on the micro level, but definitely not on the macro level). I would tell myself of 30 years ago the same things that I would tell a young person today at the start of their career:

(1) Don’t over-plan (over-specify) your career, since it will evolve in unexpected ways.
(2) Expect to discover the essential value and meaning of the work you are doing now only later in life, and that’s okay.
(3) Be ready to make big changes when the opportunities present themselves (just as I left the great Hubble Space Telescope project to pursue bigger opportunities, then I left NASA after 18 years to create a new academic discipline, and then I left a tenured Full Professorship in an innovative university in order to join a brilliant revolutionary business).
(4) Trust your training – I am amazed at how all sorts of things that I learned and experienced have become relevant and useful at different stages of my life.
(5) Be more tolerant of your own mistakes – remember this saying: “Good judgement comes from experience, and experience comes from bad judgement.”
(6) Listen to, but don’t react to, the naysayers – do what you know is the right thing for you.
(7) Your aptitudes will be more valuable to you in the long term than your skills (aptitudes include the 7 C’s: cool under the pressure of hard work, courageous in problem-solving, curious, creative, communicative, collaborative, and committed to lifelong learning).
(8) And finally, don’t confuse your job with your career – I had many jobs, but I have had only one career: being a scientist who loves to make discoveries from data. When your lifelong passion becomes your career, hold on to that very tightly.

Which companies or individuals inspire you, and keep you motivated to achieve great things?

Of course, I have to mention my current employer Booz Allen Hamilton, which is making all of the right moves in the area of data analytics and data science. I am also inspired by the top big data and data science influencers – some of them are younger than my own children, and it excites me to see that the next generation is embracing the power of data to transform the world. I am inspired by organizations that apply data for social good, which include Booz Allen, DataKind, Bayes Impact, Kaggle, and many others. I am motivated by the awesome and fast accomplishments, discoveries, and innovations that are occurring all around us in the data analytics world: in government, businesses, and academia. I have shared the stories of so many of those companies and organizations on Twitter that I cannot begin to keep count of how many, though maybe my 44,000 tweets are a good estimate of the number of data stories that have been worth sharing with the data science social community. Those people (my faithful and fearless followers who track my Twitter firehose of data stories) are definitely the individuals who keep me motivated! The flood of new and interesting data sources every day motivates all of us data natives to achieve great things, because “Data is what we do”. This is a great time to be doing data!

(image credit: Evan Bench, CC2.0)
