An Interview with a Lead Data Scientist

CQF alumnus, Dmytro Iefremov, is a Lead Data Scientist at Mastercard. We spoke to Dmytro about the evolution of his career in the field, the next big thing for AI, and his advice to aspiring professionals.

What inspired you to pursue a career as a quant, and how did you get started?

Technically I'm not a quant right now - I'm a data scientist - but it's a closely related field. In terms of what inspired me, I was eager to find my niche at the intersection of math, economics, and technology. I have two master’s degrees, one in economics and one in computer science, specifically around machine learning. My bachelor’s degree was in economics as well, but I realized that I'm more interested in mathematics than in the social sciences.

How did the CQF program impact your career trajectory?

The CQF was a good entry point for quant finance, and I learned a lot. Moreover, I learned what I liked and didn't like about my career and figured out where I should head next. The CQF gave me an initial boost and a solid introduction to machine learning, which I've been using throughout my career. It was a good start for my technology career because after the CQF, I worked for software companies that were working with quantitative finance, for example, FinCad, which is a company that pretty much does the stuff we learned on the program. Quantitative finance is very broad, so I had exposure to multiple areas of it. The CQF was a great mix of everything related to quantitative finance, and I found my niche within financial technology as a data scientist specializing in machine learning.

Can you describe a typical working day in your role?

I have a mixed set of responsibilities because I'm a Lead, so I am involved in multiple projects, but more or less I'm a research and development unit within the company. At the day-to-day level, I build AI agents to automate some of the things used in investigation, before model debugging or analysis. I'm building machine learning pipelines, and I work with an interesting set of technologies. At Mastercard, we deal with a truly big dataset, related to transactions. So, I work with forms of distributed computing, where Spark is our go-to framework for most things. Back in earlier days, I used C or C++ to write extensions to Python in rare cases where there were bottlenecks or other considerations, but nowadays there are libraries for everything, so I don't use C++ directly anymore. Python itself is potentially not the most efficient language, but as a high-level interface to C libraries, it does a great job. Also, Python has been growing dramatically and the ecosystem is huge.

What's the most interesting or challenging project you've worked on?

The most interesting was probably a pre-LLM project on natural language processing. In those days, we had a model for every single aspect of language, like a model to determine the part of speech, for example whether a word is a noun or a verb. We literally had a model for every tiny piece, and in order to build a working natural language processing system, you had to be really creative. Nowadays, everything is pre-trained for you by large tech firms and frankly, it is not as fun anymore. I liked that old project because it took some creativity, specifically because all the models were kind of shallow and you had to combine many layers and many algorithms into a single pipeline, versus today where you can just ask an LLM with one prompt.

How are you using AI in your current role?

I build agents for Mastercard, and I use coding assistants because it's so much easier. When I was doing the NLP project, it felt like we would never solve the problem. It felt like the model was too stupid and we needed to come up with something smarter. As it turns out, we're using an even stupider model now, but throwing a lot of stuff into it, and it works.

What do you think will be the next big topic for AI in quant finance?

My answer would be real world models. The real world model is an abstract concept, but it's basically a model that's aware of how the real world works. For example, if you're generating video, then the video generator should probably be aware of physics, so that the action in the video is realistic – for example, people don't just fly through the air and things like that.

The biggest problem with LLMs is that they just capture patterns in language rather than actually learning real world models. They're not learning about causal relationships between things. They're not wiring in common sense. They're simply words. So, the biggest issue is having a real world model so that your language model can, for example, make decent forecasts about what can happen under different economic conditions. It would be able to reason about it. LLMs can partially do this now because they were trained with a microeconomic analysis textbook, but they are just sampling from a domain within parameters, which is not really knowledge, it's just a knowledge of patterns. So, I think the biggest thing would be real world models that are capable of understanding causal and contextual relationships.

Here's a good example - recently my friend asked ChatGPT this question. He said, “I have a car wash 200 meters away from home and I need to wash my car. Should I walk to the car wash, or take the car?” And ChatGPT said, “It's really close, so just take a walk.” This is a good example – there's no real world model in there.

What advice would you give to someone looking to enter this field?

Number one - always learn the math first. People are getting spoiled by having libraries for everything, but you're not going to be able to innovate, understand, and use them appropriately if you don't understand the math really well.

Second would be to try to get exposure to real world projects, even if they're outside of your current job. Stakeholders are really impressed if you do something cool that actually solves a problem. People tend not to like useless, cool stuff. There's a difference. So, building something that solves a problem using AI would be an interesting showcase. For example, for an AI-related role, or if you're a quant, it would be nice to build some projects within quantitative finance.

Finally, with respect to AI specifically, I would recommend paying more attention to the agentic patterns rather than the LLMs because LLMs are going to be enterprise trained. If you lose some innovation there, it's not going to be the end of the world. I don't think people are even fine-tuning the language models anymore. Agentic patterns are something you will be working with a lot in the future – and by agentic patterns, I mean things like self-reflection. What it means is that first you ask the model to perform some work, and then you ask it to check its work, and if it finds a problem, you ask it to regenerate, keeping in mind the commentary on the previous generation. This works amazingly well, so even when a model is not very reliable, if you build agentic self-reflection around it, it works much better.
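
The self-reflection pattern described above can be sketched in a few lines of Python. This is only an illustration, not the interviewee's implementation: the names `self_reflect`, `toy_generate`, and `toy_critique` are hypothetical, and the two toy functions are hard-coded stand-ins for real model calls, reusing the car wash question from earlier in the interview.

```python
from typing import Callable, Optional


def self_reflect(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Generate a draft, ask for a critique, and regenerate with that
    critique in mind until no problem is reported or rounds run out."""
    draft = generate(task, None)  # first attempt, no feedback yet
    for _ in range(max_rounds):
        feedback = critique(task, draft)  # ask the model to check the work
        if feedback is None:  # nothing to fix, accept the draft
            break
        draft = generate(task, feedback)  # regenerate using the commentary
    return draft


# Hypothetical stand-ins for model calls (assumptions for illustration only).
def toy_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "Walk to the car wash."
    return "Drive the car to the car wash."


def toy_critique(task: str, draft: str) -> Optional[str]:
    if "Walk" in draft:
        return "The car itself has to be at the car wash."
    return None


answer = self_reflect(
    "The car wash is 200 meters away. Should I walk or take the car?",
    toy_generate,
    toy_critique,
)
print(answer)
```

In a real agent, `generate` and `critique` would each wrap a call to a language model, and the `max_rounds` cap keeps the loop from running forever when the critique never comes back clean.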
