Stop chasing unicorns and simply focus on what needs to get done.
This is my understanding of what really matters:
Every solution is only as good as the definition of the problem it is meant to solve.
Obviously, whoever is supposed to come up with a clever solution needs to know their way around your domain to do that.
Keep asking the "why" question until you have really peeled away the onion's layers and decomposed the problem into its atomic bits and pieces.
Now you have a starting point for a systematic exploration of possible solutions.
Without data, you cannot really do much. So identify what you have and what you need. If there's a gap, you have some upfront feasibility checking to do to see whether or not you can close it.
Then be pragmatic, in a very systematic way, about how to access the data you need. Not every problem in the world needs a hyper-resilient, real-time-enabled microservice tech stack running in the cloud!
CSV files and simple RDBMS structures will definitely get you off the ground. But do be VERY systematic about setting up even those simple data backends, because if you are successful, the day of scaling up WILL come. But that's another story, really.
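To make the "simple but systematic backend" idea concrete, here is a minimal sketch using only the Python standard library. The file names, columns and values are made up purely for illustration: the point is that even a tiny SQLite table with an explicit schema already beats a loosely dumped flat file.

```python
import csv
import sqlite3

# Hypothetical raw data -- in practice this would be your existing CSV export.
with open("measurements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["sensor_id", "value"])
    writer.writerows([["a", 1.2], ["a", 3.4], ["b", 5.6]])

# A simple but systematic backend: an explicit schema, not just a dumped file.
con = sqlite3.connect("measurements.db")
con.execute("DROP TABLE IF EXISTS measurements")  # keep the sketch re-runnable
con.execute("""
    CREATE TABLE measurements (
        sensor_id TEXT NOT NULL,
        value     REAL NOT NULL
    )
""")

# Load the CSV into the table row by row, with explicit typing.
with open("measurements.csv", newline="") as f:
    rows = [(r["sensor_id"], float(r["value"])) for r in csv.DictReader(f)]
con.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
con.commit()

# The RDBMS now answers questions the flat file could not answer cheaply.
avg = con.execute(
    "SELECT sensor_id, AVG(value) FROM measurements GROUP BY sensor_id"
).fetchall()
print(avg)
```

Nothing fancy, and deliberately so: when the day of scaling up comes, a clean schema migrates; a pile of ad-hoc files does not.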
Assess which algorithm is most likely to get you at least a bit closer to the answer.
Start simple before getting out that sledgehammer if you are only after cracking a nut. If it has to be Deep Learning, then go for it. In the end, it's all just Linear Algebra and some R or Python code that you need to understand.
Unless you really need to tap into TRUE Artificial Intelligence, it's really not rocket science.
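A tiny illustration of the "it's all just Linear Algebra" point, with made-up numbers: an ordinary-least-squares fit of a line needs nothing more than two means, a covariance and a variance. That is often a perfectly good baseline before reaching for the deep-learning sledgehammer.

```python
# A "nut-sized" problem: fit y = slope * x + intercept to a handful of points.
# Ordinary least squares in plain Python: slope = cov(x, y) / var(x).
# The data below is invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]  # roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n

slope = cov_xy / var_x
intercept = mean_y - slope * mean_x
print(slope, intercept)  # close to 2 and 0, as the fake data suggests
```

If this baseline already gets you close enough to the answer, you just saved yourself a GPU cluster.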
There is just ONE very crucial aspect: be VERY explicit about your learnings as you try out algorithms and potential solutions. But we'll cover that in the next section.
In the end, whether or not you can successfully bring it all together is simply a matter of organizing the entire end-to-end process well. Set yourself up for an iterative approach from the very beginning and be very, very systematic about keeping track of all your moving parts.
Keeping track means versioning just about everything that is subject to change and impacts any part of your data product or solution. That is usually pretty much everything.
Okay, I agree: this is BY FAR the hardest one to get right. But it is learning by doing anyway, since you won't find a generic one-size-fits-all approach that automagically works for YOU. You'll get there, though, if you're committed to it and somewhat familiar with agile working styles.
Being a “former business guy turned geek” during my studies – which in turn led me to a “Statistical Algorithms Bootcamp” as a PhD student, long before Data Science was even a thing – Data Science, Data Engineering and Computer Science have since become my preferred arenas, in which my technical skills, creativity and curiosity perfectly unite to groove together.
Pretty much every aspect of the Data Engineering/Data Science spectrum I have come across so far has truly fascinated me, and continues to do so; I generally want to know how things actually work – all the way down to the nitty-gritty details. Thus, I am confident in saying that I know my way around the vast Data Engineering and Data Science landscape quite well by now.
Does that make me one of those infamous “all-in-one” unicorns that the Silicon-Valley-led bandwagon long strived for, before seemingly settling for a functionally highly specialized organizational setup à la Adam Smith? No, not at all.
It simply means that I am a very hands-on type of guy who loves having the actual skills and expertise to potentially “own” as many as possible of the tech stack, engineering, modelling, project management and communication aspects involved in building data products and services. I generally strive to put myself in a position to get things done “end-to-end” whenever possible – instead of “going narrow” and then having to rely on specialized experts to cover those parts of the organizational or infrastructural pipeline that are prerequisites for my own work.
Obviously, this generalist approach does imply that the depth of my expertise in any given area usually cannot rival that of a narrowly specialized expert – at least not out-of-the-box. And while others may refer to my skillset as being “T-shaped”, I would rather describe it as “tree-shaped”, since knowledge is inherently organic: I am able to adjust quickly and thoroughly to the demands at hand – extending branches, growing new leaves, even growing new branches if necessary. Being the notoriously curious and inquisitive person that I am, I truly enjoy being confronted with new challenges that require me to learn and apply new skills – so gardening my skill tree comes very naturally to me.
All those unicorns and trees set aside, you could also simply say that I am a Full Stack Data Scientist who loves to code in R and Python, also keeps an eye on things like Julia and Scala, is very comfortable wearing multiple organizational hats, speaks several dialects of "tech", is also fluent in "business" and thus shines wherever those worlds intersect.
I am especially good at decomposing complex problems into manageable parts, at finding creative analytical and programmatic approaches to tackling those parts, and at eventually distilling insights down to the essentials, which I in turn communicate adequately to key stakeholders and decision makers. A good intuition, a high level of perseverance and an aptitude for systematic DevOps, DataOps and agile planning approaches – while still applying a sound "80/20" pragmatism – help me stay on top of things.