This is an edited transcript of our podcast episode with Ciamac Moallemi, Professor of Business in the Decision, Risk, and Operations Division of the Graduate School of Business at Columbia University. He gave his views on types of quant investing, the slow impact of machine learning on finance, how to think about big data, and why bitcoin and crypto are special. While we have tried to make the transcript as accurate as possible, if you do notice any errors, let me know by email.
Ciamac’s Background & Career Path
Bilal Hafeez (03:11):
And what I’d like to do with all my guests first is really to get a bit of a sense of their background. How did you end up in the area you’re in right now on the academic side? And in terms of the area of markets you focus on. Going to university, is this what you imagined you were going to do? And was this the path you were going to follow or what happened?
Ciamac Moallemi (03:29):
Sure. So, I guess it wasn’t quite the path that I imagined I would follow. When I was an undergrad, I studied math and computer science. And the math I liked was math of a continuous flavour – probability and so on, if you will. At some point, I realized that there were applications of this in finance and that you could earn a reasonable salary doing that. So, I became quite interested in that. And midway through my undergrad (I went to MIT for undergrad), I started working at a hedge fund which was located near the MIT campus, and I ultimately became a partner there. I did that for, I don’t remember quite exactly how long, but let’s say five or six years, and we traded fixed income relative value. This was in the mid to late 90s.
The strategy was quantitative in the sense that we were trading things like interest rate derivatives that required models to analyse and to hedge and do portfolio construction and so on. But at some level it was discretionary also, like the decision on whether to put on a trade and how to size it. And so, it was ultimately a discretionary fund. I continued with that for a while and it was quite interesting. And then we hit the LTCM crisis in 1998. And we weren’t doing quite all the things Long-Term Capital Management was doing, but we had some overlap. And that was a challenging time.
And at that time, I became quite disillusioned with our trading strategy, as I realized a lot of it was based on the assumption, for example, that we could consistently fund our positions and use leverage to try and exploit mispricings and turn them into real paychecks. And that turned out not to be a good assumption. So, I did some other things for a while. I went to work at a biotech start-up doing computer modelling. At some point, I thought it’d be good to be a professor, and for that you need this PhD thing. So, I did a PhD at Stanford. And after that, I moved to Columbia as a faculty member.
Now, in terms of my research, I work on two things. One, I’m interested in problems where you’re making decisions over time under uncertainty. This goes under the rubric of stochastic control or dynamic programming or various other names. Two, I’m also quite interested in applications. And the main applications I think about are in finance, namely things like quantitative trading, high-frequency trading, market microstructure, and cryptocurrencies. These are areas where the tools can be applied in really interesting ways. So that’s mainly what I do now. In addition to my Columbia work, I’m also a principal in a small quantitative trading firm called Bourbaki LLC, where we apply some of these ideas to try and deliver superior risk-adjusted returns. So, I like to drink the Kool-Aid, if you will, or practice what I preach.
Bilal Hafeez (06:13):
How do you find having an academic career at the same time as being a practitioner? Because that’s quite unusual. Often there’s a bit of a separation, or people move from academia to a fund or something, and they stay on one side or the other.
Ciamac Moallemi (06:25):
I think there are pros and cons to both worlds; it’s not that one dominates the other. I think if you do it the right way, there are a lot of synergies. Obviously, my work as a practitioner benefits from my insights as an academic, because I’m working on similar problems. But oftentimes, it’s even more interesting and beneficial the other way. A lot of the applied problems that I think about are really quite recent and have to do with broader shifts towards the computerisation of markets and so on, and working as a practitioner really gives me the insight to identify problems that academics don’t even know about. A similar challenge for economists is that you can end up thinking that you’re working on a practical problem, but it’s actually not something anyone cares about, or the assumptions you’re making may not be focused on the first-order drivers of outcomes. So, I think the practitioner side grounds me in reality.
Types of Quant Investing – Prediction vs Risk Premia
Bilal Hafeez (07:19):
Okay. Now, that’s good to hear. Yeah. And you mentioned the academic side that you focus on use case quantitative finance, quantitative investing. And you mentioned earlier in the mid to late 90s, you were looking at quantitative finance back then, and quantitative finance means different things to different people. So, in one era or financial history, it was more to do with relative value and things like that. And today it’s more to do with systematic strategies, and more specifically it’s increasingly associated with artificial intelligence and machine learning. So, in your mind, how do you define quantitative investing or finance?
Ciamac Moallemi (07:55):
Yes. So, I think you hit on a bunch of the important points. In the 90s, I probably would have called what I was doing quantitative investing, but I wouldn’t call it that now, because there was a significant discretionary component. In fixed income relative value and volatility trading and so on, people use a lot of math and computational tools, but at the end of the day, it’s a human pulling the trigger. The way I would describe quantitative trading now, and certainly my primary interest, is systematic quantitative trading, where minute-to-minute decisions are made by the computer, and the role of a portfolio manager is merely to do research and to set up the models in advance, not exercising any day-to-day discretion.
Now, within that realm of systematic quantitative strategies, I also think there’s a further division. And people have different words for this. Using my words, I think on the one hand, there’s quantitative investing that’s driven by predictions. You’re looking at the state of the world, you’re looking at data, you have whatever features you want, and based on these features, you’re going to buy an asset right now because you think the price is going to go up, or sell if you think the price is going to go down. So really you have some prediction of what’s going to happen in the future, be it in a high-frequency context, maybe five minutes from now, or in a slower context, maybe two weeks from now, or some other time horizon.
On the other hand, there are quantitative strategies that are more driven by the idea of earning a risk premium – the idea that there are factors you’d like exposure to because over a very long time horizon, you’re going to earn a risk premium for that factor. People apply quantitative methods there as well. And sometimes people describe this dichotomy as the difference between active trading (the prediction-driven side) and passive trading (the more risk-premium side). You can use different words for it. I’m mainly focused on the former category – quantitative trading where you have conditional views that are constantly being adjusted over time, and you’re trying to position a portfolio to take advantage of that.
Why Machine Learning is Impacting Finance More Slowly than Other Domains (Like Vision and Text)
Ciamac Moallemi (09:55):
Going to your other question on artificial intelligence. Artificial intelligence, machine learning, statistics: in my mind, these are basically the same core ideas, maybe developed by slightly different communities. I think when people talk about AI or machine learning in more recent times, they’re very focused on particular domains where there’s really been a phase transition. And the domains I’m thinking about are things like computer vision (recognizing faces or coffee cups in an image), natural language processing, and certain types of games (playing Chess or Go etc.). In those domains, there have been specific techniques, associated with names like deep learning and reinforcement learning, that have really created a phase transition, where there’s stuff you can do with computers now that you simply couldn’t do 10 years ago and certainly not 20 years ago.
On the other hand, in quantitative trading, I think it’s a much more evolutionary process. So, first of all, I think there’ve been very sophisticated mathematical thinkers in these markets for a long time. Some of the most successful people today, for example, the Renaissance guys, have been doing it for 40 years or however long. I think a lot of these ideas like deep learning and reinforcement learning are entering into this space as well, but I think it’s a little bit less of a phase transition and a little bit more evolutionary in terms of being more incremental to the things people had been doing before.
Bilal Hafeez (11:19):
Yeah. Yeah, now, that’s a very good point. I speak to lots of investors all the time, and especially people who are non-quantitative keep saying, “Okay, we’re going to just introduce this machine learning technique, and our returns are going to be turbocharged. Google’s doing it.” And when you scratch the surface, you realize it’s a lot more complicated. What you can do with image recognition is very different from what you can do when you’re dealing with a time series. So, maybe let’s go a bit deeper into this. One way people frame problems to solve using machine learning is the issue of classification versus regression. Is this an apple or not? That’s one problem to solve, versus trying to predict a time series, which has different statistical properties.
Ciamac Moallemi (12:04):
My perspective is that those problems aren’t fundamentally different, in the sense that both are examples of supervised learning. The way you would train a program, either to predict the stock price tomorrow or to determine if something is an apple or not, is the same: you have a bunch of examples where you see the data and you see what the right answer is (i.e., is it an apple or not an apple? What was the actual return from today to tomorrow?), and you try to build a model that is able to predict the latter from the former. Now, there are specific technical details and these things are done differently. For example, if you’re trying to predict whether something is an apple or not, you would not use a linear regression. There’s something called logistic regression, which is a small modification of linear regression oriented toward that type of yes-no question. So, I think there are small technical differences, but I don’t think it’s anything fundamental.
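To make Ciamac’s point concrete, here is a toy sketch showing that the two problems share essentially the same training loop: the only difference is the link function, with logistic regression as the small modification for the yes-no case. All the data and parameters below are made up for illustration.

```python
import math
import random

random.seed(0)

def fit(xs, ys, link, lr=0.1, steps=2000):
    """Gradient descent for a one-feature model y ~ link(w*x + b).
    With link = identity (squared loss) this is linear regression;
    with link = sigmoid (log loss) it is logistic regression.
    Both losses yield the same gradient form: (prediction - target) * x."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = link(w * x + b)
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return w, b

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
identity = lambda z: z

# Regression: predict a toy "next-day return" from a feature.
xs = [x / 10 for x in range(-10, 11)]
rets = [0.5 * x + random.gauss(0, 0.05) for x in xs]
w_lin, b_lin = fit(xs, rets, identity)

# Classification: is it an apple (1) or not (0)?
labels = [1 if x > 0 else 0 for x in xs]
w_log, b_log = fit(xs, labels, sigmoid)

print(round(w_lin, 2))                      # close to the true slope 0.5
print(sigmoid(w_log * 1.0 + b_log) > 0.5)   # a clearly positive x is classified "apple"
```

The shared gradient form is a real property of these two models (both are generalized linear models with their canonical links), which is one way to see why the difference is technical rather than fundamental.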
Bilal Hafeez (12:50):
Okay. Now, that’s good to know. And you mentioned earlier that we’ve seen these phase transitions, these revolutions, in certain areas, and that there’s been more evolution in finance. Why is that? Why haven’t we seen this revolution or this phase transition in finance?
Ciamac Moallemi (13:07):
Let’s take image recognition as an example. At some level, for humans, image recognition is an easy problem. If you’re looking at a picture and you want to ask a question, is there a coffee cup in this picture? That is a very easy question to answer. You and I would both look at it and come up with the same answer. A two-year-old child could answer that question and so on. So, at some level, for humans, these are easy problems. And the challenge for computers was, we just didn’t have good representations for, let’s say, the world of all natural images that could possibly occur. We didn’t have good representations to understand what’s in an image. Phrased differently, that type of image recognition problem is in a high signal-to-noise regime. When you have the right representation, it’s very easy and the answer isn’t controversial.
However, in finance, we actually do have much better representations. I’m thinking about ideas such as the CAPM, factor models, or any of these ideas which have been studied for many years. And whether these models are correct or not, we have a lot of intuition for how prices should behave. So, number one, I don’t think we have the issue of finding the right representation, if you will. Number two, the regime is completely the opposite: it’s a very low signal-to-noise regime, because at the end of the day, the markets are largely efficient.
So that’s the self-correcting mechanism where people identify anomalies, they trade on them, and the anomalies are arbed out of the market. So, you have really, really small levels of signal that you’re trying to find in a sea of noise. At some level, it’s a bit of a different problem. Now, I don’t mean to diminish the accomplishments of new methods in areas like computer vision; those are obviously incredible methods and more power to those people. And I also don’t mean to say that the ideas of deep learning and reinforcement learning don’t have a place in finance. But I do think our starting point is way different. Even before the introduction of some of these techniques, lots of people had quite sophisticated financial models, whereas in the computer vision world, you were in “the land of the blind, where the man with one eye is king”. You’re starting from a point where you can’t even do basic things, so all of a sudden, being able to recognize a coffee cup in a picture seems amazing, even though at some level it’s not a hard problem.
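One rough way to appreciate that low signal-to-noise regime is a back-of-the-envelope calculation, with purely hypothetical numbers, of how much data it takes before a tiny daily edge is statistically distinguishable from noise:

```python
# Hypothetical numbers: a strategy with a tiny daily edge buried in noise.
edge, noise_sd = 0.0002, 0.01   # 2bp/day mean return vs 1% daily volatility

def days_for_significance(edge, sd, t=2.0):
    """Days of data needed before the sample mean is ~t standard errors
    from zero: solve  edge / (sd / sqrt(n)) = t  for n."""
    return round((t * sd / edge) ** 2)

n = days_for_significance(edge, noise_sd)
print(n)  # 10000 trading days, i.e. roughly 40 years
```

With a 2bp daily edge against 1% daily volatility, a simple t-test needs on the order of 40 years of daily data to flag the edge, which is one way to see why overfitting to noise is such a pervasive danger in this domain.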
The Pros and Cons of Using Linear Regressions
Bilal Hafeez (15:22):
Yeah. Yeah. Understood. Okay. That makes sense. And so, let’s talk a bit more about the toolkit that you think is appropriate for financial markets. I imagine most listeners to this podcast are familiar with linear regressions, whether it’s just least squares regressions or probit regressions, in this statistical econometrics world. A lot of people probably did courses in that at school and use it in their day-to-day jobs. And they may use principal component analysis and all the stuff you use in an econometrics world. So obviously that’s one way of making predictions. Now, what do some of the machine learning techniques bring to the table that you can’t do in econometrics?
Ciamac Moallemi (16:02):
So, I think linear regression has a lot of nice properties. It has nice computational properties. It has nice theoretical properties, and you get things like confidence intervals and P-values. Some of these things are maybe a little controversial and you have to make some assumptions, but it gives you a lot of structural insight. Your regression is accomplishing two things. Number one, it’s making predictions. Number two, it’s giving you information, for example, about which variables are more or less important than others, which variables are statistically significant, what is the direction of the relationship between a particular X variable and the outcome you’re looking at, and so on.
Now, in the machine learning world, you throw out all of that second part. You say, I don’t care where the prediction is coming from, I just want to focus on making very accurate predictions. And so, there are more techniques available. They are not reliant on having linear relationships, and many relationships are indeed not linear, or perhaps it’s difficult to identify how to come up with features so that they become linear. So, you gain a lot more flexibility that way. But that said, I think if you could only give me one predictive tool, linear regression would be the one I’d pick.
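As a minimal illustration of that “second part”, here is a toy one-variable OLS on synthetic data that returns not just a prediction rule but also the direction and statistical significance of the relationship. The data and coefficient values are made up for illustration.

```python
import math
import random

random.seed(2)

def ols(xs, ys):
    """One-variable ordinary least squares: returns the slope, the
    intercept, and the slope's t-statistic, i.e. not just predictions
    but the size, direction, and significance of the relationship."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    t_stat = slope / math.sqrt(s2 / sxx)
    return slope, intercept, t_stat

# Toy data: a positive relationship plus noise.
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + random.gauss(0, 1) for x in xs]
slope, intercept, t = ols(xs, ys)
print(slope > 0 and t > 2)  # True: the fitted direction is positive and significant
```

A black-box model fit to the same data could match the predictions, but it would not hand you the slope and t-statistic, which is the structural information being traded away for flexibility.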
I think in quant these days, if you want to be competitive, you have to innovate. And I think people try to innovate in two ways. One might be along the lines of what we’re talking about, which is to apply more black-box, machine learning types of methods to identify maybe more complicated relationships that do not fit in a linear form. The other might be to leverage data that has not been used in the past. For example, most people in the past in this systematic quantitative space have really been using technical data: things like prices and trades and volumes and so on.
Well, there’s this whole world of alternative data, whether it’s leveraging satellite images of parking lots to understand how a retail stock is going to do, or looking at credit card data, or natural language processing of news and so on. So another area where people try to innovate is to obtain data which gives a competitive advantage. If you have unique data, you may not need any of this fancy machine learning stuff; linear regression may work just fine. But on the other hand, if you’re going to work with data that other people have, then in order to have an edge and identify something novel, you’re going to have to try techniques that are beyond the generic.
The Advantages of Machine Learning in Non-linear and Complex Markets
Bilal Hafeez (18:18):
Yeah. And so, I’ll come back to the data point, but in terms of the techniques, obviously there’s a range of machine learning techniques: neural networks, random forests, decision tree-type approaches, deep learning, ensemble-type models and all of these sorts of things. If you do want to set up a machine learning-based quantitative process, what are the techniques that you think fairly reliably work in financial markets? And for which techniques is it, at this stage, unclear how useful they are?
Ciamac Moallemi (18:55):
Yeah, I honestly don’t have a good answer to that very specific question. My philosophy would be to try the most basic stuff first, even just linear regression, and then build on top of that. At a high level, different techniques have different kinds of trade-offs. Things like neural networks are pretty good at capturing very nonlinear relationships. Things like decision tree-based methods are very good at dealing with lots of variables and getting rid of the feature selection problem. I think there are pros and cons to all of these. I don’t think there is a unique answer. I think the answer is you’ve got to try it.
Bilal Hafeez (19:28):
Yeah. And one issue, I guess people find with some of these techniques is what you talked about earlier, which is that you don’t really know the why. It is a black box. How much of an issue is that for you? Because someone like me, yeah, I like to know why the model is doing what it’s doing. And so, I feel very uncomfortable when I’m confronted with a super black box. So how do you go about dealing with that issue if you do think it’s an issue, maybe it’s not an issue?
Ciamac Moallemi (19:54):
Right. So, I would lift it up a little bit.
Bilal Hafeez (19:56):
Yeah.
Ciamac Moallemi (19:57):
I think the biggest first-order issue in making predictions of future returns is overfitting. We’re in this regime where the signal is very weak and the market is largely efficient. From the perspective of mathematical modelling, that means that future prices are a little bit of signal and a lot of random noise. And when you fit models on historical data, what you need to avoid is fitting to the random noise, because when you then roll out that model in production, it’s not going to work.
Now, one heuristic that people often use to avoid overfitting is to ask the question: does this make sense? Can I come up with a story? Okay, the model is coming up with these predictions, but maybe if I’m doing something like linear regression, I understand what those variables are. I can point at a certain structural supply-demand imbalance which is creating this anomaly, and so on. So that’s one mechanism to avoid overfitting. And certainly, if you can come up with an explanation, if you have a structural understanding of where those predictions are coming from, that’s fantastic.
On the other hand, there are probably many anomalies out there where what’s really going on under the surface is either not visible or just very complex. There may not be a clean story. And by requiring a structural understanding, you’re putting yourself at a competitive disadvantage to others who don’t care. Because at the end of the day, if you buy an asset, the price goes up and then you sell it, you have made money. As long as it consistently goes up and you’re not overfitting, I would argue that that’s fine. So, in my mind, the universe of anomalies for which it’s possible to come up with an explanation is much smaller than the universe of anomalies that may exist.
If you’re in the black-box world, on the other hand, you have to think about other ways to address that challenge of overfitting. There, I think you want to be really careful about having a disciplined research process, really careful about having separate training sets and validation sets and test sets and so on, really careful about how often you go out of sample. And you want to try to use other techniques to manage overfitting.
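One simple way to enforce that separation on time-series data is a walk-forward split, where each model is fit strictly on the past and each out-of-sample window is scored exactly once. A minimal sketch (the window sizes here are illustrative):

```python
def walk_forward_splits(n_obs, n_train, n_test):
    """Yield (train_idx, test_idx) windows that never let test data leak
    into training: each window is fit on the past and evaluated once on
    the period immediately after it."""
    start = 0
    while start + n_train + n_test <= n_obs:
        train = list(range(start, start + n_train))
        test = list(range(start + n_train, start + n_train + n_test))
        yield train, test
        start += n_test   # roll forward; each test period is used exactly once

# Ten observations, four-period training windows, two-period test windows.
for train, test in walk_forward_splits(10, 4, 2):
    print(train[0], train[-1], test[0], test[-1])
```

The point Bilal raises next, out-of-sample quietly becoming in-sample, is exactly what happens when researchers peek at a test window, adjust the model, and reuse the same window; the discipline is in running each test window only once.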
Bilal Hafeez (22:02):
Okay. Yeah. One thing I have found in this field is that a lot of people’s out-of-sample becomes in-sample very quickly. I guess that’s where the discipline comes in. You have to be honest with yourself and your process rather than contaminating your different sample periods.
Ciamac Moallemi (22:18):
That’s right. And I think my impression is that the most successful people are very, very careful about having a research process in place that has that discipline.
How to Think About Alternative and Big Data
Bilal Hafeez (22:26):
Yeah. And you mentioned data earlier. And when you’re using data, obviously there’s lots of considerations, especially in the alternative data space in terms of, one, quality of the data. Second, how much work you need to do on the data, whether there’s lots of noise in the data, is that seasonally adjusted, not seasonally adjusted, what are you capturing exactly? What are the types of considerations you use when you come across alternative data sets?
Ciamac Moallemi (22:52):
I think the types of things I would think about when evaluating an alternative data set are the things you said: for example, how clean is the data. Other considerations include how applicable the data is. I briefly mentioned satellite images of parking lots. Well, that works for a certain class of large retailers like Walmart and Costco and so on. So, if you’re going to invest all this money and all this modelling effort in some data, but you’re only going to make predictions for 50 names out of a universe of 5,000 names or something like that, then you’ve got to think about that trade-off. Another thing is to think about how unique the data is and how many other people have it. Really unique data, where you’re the only one that has it, is obviously worth more.
Bilal Hafeez:
Yep. Yeah. Okay. And you often hear this term ‘big data’ bandied around everywhere. When I think of big data, I think big would be something that you can’t really use Excel for. That’s the simple heuristic I embrace. How do you think about big data, and are there specific challenges with using such large data sets?
Ciamac Moallemi:
So, I do use Excel, but not really for quantitative modelling. Certainly, with any data you’re going to work with in this context, you’re beyond Excel. There are a couple of markers that make data sets harder to work with. One is if your data does not fit into memory. Many algorithms work very well if you can fit all your historical examples into memory, but once you exceed that, it’s not clear how to proceed. There’s also the related issue of when the data gets big enough that you need more than one CPU to analyse it, and maybe even more than one computer.
The worst-case scenario would be data that’s so large that you have to store it on disk, and for the processing, you have to use many cores across many different machines, and that creates engineering challenges. So, for example, just to go back to the methodologies, one nice thing about neural networks is on the engineering side. Neural networks trained with stochastic gradient descent are easy to parallelize; they’re easy to work with for larger data sets and so on. Contrast that with something like trees, which are really more oriented towards data sets where you can fit everything into memory and work on one computer and so on.
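To make the memory point concrete, here is a toy sketch of minibatch stochastic gradient descent fitting a one-variable linear model on streamed batches, so only one small batch is ever held in memory at a time. The data is simulated here, standing in for batches read from disk or from other machines; all parameters are illustrative.

```python
import random

random.seed(3)

def stream_batches(n_batches, batch_size):
    """Simulate a data set too large for memory: batches arrive one at a
    time, as if read from disk or shipped from another machine."""
    for _ in range(n_batches):
        xs = [random.gauss(0, 1) for _ in range(batch_size)]
        ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]
        yield xs, ys

# SGD touches one batch at a time, so memory use is O(batch_size)
# regardless of how large the full data set is.
w, lr = 0.0, 0.05
for xs, ys in stream_batches(n_batches=2000, batch_size=32):
    grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 1))  # recovers the true coefficient, 2.0
```

A standard tree-growing algorithm, by contrast, repeatedly scans and sorts the full training set to choose split points, which is why it is more naturally suited to data that fits in one machine’s memory.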
Two Approaches to Portfolio Construction
Bilal Hafeez (25:18):
Yeah. Okay. Understood. Yeah, it’s interesting you mention the engineering challenges, because one thing I do find in the quantitative space is that the practical side becomes very important: how you set up your CPUs or servers and so on. And also how you manage your data; the data engineering side seems to be very important too. I was speaking to a friend of mine who works in machine learning at one of the big tech companies, and he said he really values people who have worked in statistical offices, because they really understand the noise in the data and how to manage the data itself. Obviously, if you have a bad data set, then it doesn’t matter what you do on top of it. But it’s not the most glamorous end of the market. And moving on to a more general point, when you do have various models that you think deliver alpha of some kind, how do you think about portfolio construction, or combining models together?
Ciamac Moallemi (26:16):
It’s a very important topic, but maybe before I dive into that, I can give a little bit of a higher-level view, because I’m not sure everyone understands the type of approach you’re alluding to. I think there are two broad approaches to developing quantitative trading strategies. One, which I’ll call the end-to-end approach, is: you identify some data or some market feature that maybe has predictive value, and then you optimize a strategy directly on top of that.
So, for example, you may have this variable you’ve identified; you set up the optimisation around it, set up levels at which you’re going to get into the trade, and how much, and so on. And you optimize all this with back testing. I call that end-to-end because you’re starting with the relevant information, and the way you measure whether you’re doing well, and figure out how to adjust your parameters and so on, is really by looking at your objective: investment performance. Ultimately, that’s the thing you care about. So, that’s the end-to-end approach.
On the other hand, in the decomposition approach, you try to split it into sub-problems which are a little bit more manageable. And I think this is what you were alluding to. The first step in the decomposition approach would be: let’s make predictions. We’re buying because we think the price is going to go up; we’re selling because we think the price is going to go down. So, let’s build predictive models of what’s going to happen in the future. And the outputs may be predictions of relevant things: for example, what’s the return going to be, maybe at a fixed horizon, maybe a whole trajectory, and also things like volatility and so on. So, you get a bunch of predictive outputs, and that gets fed into the next step, which is portfolio construction, right?
Bilal Hafeez (27:50):
Yeah.
Ciamac Moallemi (27:50):
So, portfolio construction would be: let’s set up an optimisation problem where we optimise a portfolio which is going to be well-placed to take advantage of these predictions if they are indeed realized, while controlling risk so that we don’t lose too much if they aren’t, with position limits and leverage limits and all the kinds of constraints you want. And then there’s a third step, which is: okay, this is the target portfolio, now what do I actually do? How do I trade it? Do I go to a broker and put in an order, or should I go on the markets and take liquidity? Or should I trade in a dark pool? What should I do?
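A stripped-down sketch of how step one (a prediction) feeds step two (portfolio construction) might look like the following, using a one-asset mean-variance rule with a position limit. The rule and all parameter values are hypothetical, purely to show the shape of the interface between the steps.

```python
def target_position(pred_ret, vol, risk_aversion=5.0, max_pos=1.0):
    """Turn a return prediction into a target position via a one-asset
    mean-variance rule, then clip to a position limit. The unconstrained
    optimum of  w*mu - (risk_aversion/2) * w^2 * vol^2  is
    mu / (risk_aversion * vol^2)."""
    w = pred_ret / (risk_aversion * vol ** 2)
    return max(-max_pos, min(max_pos, w))

# Step 1 (prediction) feeds step 2 (portfolio construction); step 3
# (execution) would then work the order toward this target position.
print(round(target_position(pred_ret=0.001, vol=0.02), 2))  # 0.5: a half-size long
print(round(target_position(pred_ret=0.01, vol=0.02), 2))   # 1.0: capped at the limit
```

In practice this step is a full multi-asset optimisation with covariances, leverage limits, and a cost model, but the division of labour is the same: predictions in, target portfolio out.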
The Importance of Incorporating Costs
Ciamac Moallemi (28:30):
And the interface between the last two components, portfolio construction and execution, is really something like a cost function or a price impact function. Because when you’re building your portfolio, you have to anticipate how much it’s going to cost to realize that portfolio. Even if the price is going to go up a lot, if it’s going to cost even more to get into the trade, you obviously don’t want to do it.
So again, two approaches, this end-to-end approach where you do everything in one shot and this decomposition approach where you break it into steps. They have their pros and cons. The end-to-end approach, again, the nice thing is the ultimate objective. The way you’re fitting your models and measuring if you’re doing well, is your investment performance like risk-adjusted return, or a Sharpe ratio, or something like that. And that’s your ultimate goal.
Now, the downside of the end-to-end approach is that it’s not scalable. You have one giant hairball of a strategy. Let’s say you come up with an idea for another source of alpha; how do you add that in? You have to re-optimize the whole thing. It also tends to be less data-efficient. I mentioned before, the number one issue with quantitative trading is overfitting. When you optimize in back tests, it’s like you have a bunch of samples of what’s happened in the past. And because you’re looking at your P&L, those samples aren’t equally weighted; they’re weighted by your trade size. Let’s say you run a five-year back test and look at its performance. That might be dominated by the relatively small fraction of the time when you put on large trades. So, at some level, you’re underweighting the other instances, and that basically means you have less data.
Bilal Hafeez (30:10):
Okay. Yeah. Yeah. So, it’s not like for like. So that back test isn’t actually a true back test then?
Ciamac Moallemi (30:15):
It’s just less data efficient versus the other approach. In the decomposition approach, if I’m fitting a model, I’m going to focus on predictions. So, there might be instances in time where my prediction is small and in a back test I wouldn’t have gotten into a trade. But if my prediction is that this stock is going to move zero basis points, and indeed it moves zero basis points, then I should use that as extra evidence that my model is good.
And so, this decomposition approach uses the data better and is also more scalable in integrating outputs from different signals. This is a very competitive space. Many smart people with a lot of resources are out there. And these anomalies that one typically finds are very small. So maybe you have one anomaly you identified using a data set of satellite data. You may have another anomaly you identified using some other data set, and another one using some machine learning method. What you want to do is add these up, and trade when all these signals are aligned in the same direction.
And maybe no single one of these signals would exceed transaction costs. But if you now have a dozen signals, and seven of them are saying the price is going up, and you add them up, then now you’re exceeding transaction costs. So, in that way, there’s a little bit of increasing returns to scale. Anytime you put on a trade, you have this transaction cost you have to pay back. The first signal might not get you there, but adding more of them together might. Once you’re above transaction costs, if you come up with another signal, then that’s free money, right?
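[Editor’s note: a toy sketch of the “add up weak signals and trade only above costs” logic described above. The per-signal forecasts and the 5-basis-point cost hurdle are made-up numbers for illustration.]

```python
def combined_trade_signal(signals_bps, cost_bps):
    """Sum per-signal return forecasts (in basis points) and trade only
    when the combined forecast clears the transaction cost hurdle.
    Returns +1 (buy), -1 (sell), or 0 (stay out)."""
    total = sum(signals_bps)
    if abs(total) <= cost_bps:
        return 0
    return 1 if total > 0 else -1

# A dozen weak signals, none of which clears a 5bp cost on its own
signals = [1.2, 0.8, -0.3, 1.1, 0.9, 0.4, -0.2, 1.0, 0.6, 0.7, -0.1, 0.5]
print(combined_trade_signal(signals, cost_bps=5.0))      # 1: ~6.6bp combined
print(combined_trade_signal(signals[:3], cost_bps=5.0))  # 0: too small alone
```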
Bilal Hafeez (31:44):
Yeah.
Ciamac Moallemi (31:44):
You’ve already paid the fixed cost of getting into the trade. So, with the decomposition approach, if you’re a large firm, you can have a whole bunch of researchers working on alpha, and they can be working on very different alphas with different data sets and different techniques and so on. And you can coherently combine them and have a strategy which basically monetizes more effectively than any single model, because it trades when these signals are aligned.
The Challenge of Alpha Mixing in Portfolio Construction
Ciamac Moallemi (32:14):
However, that creates a number of interesting issues. The first challenge, which you brought up and which I think challenges practitioners, is poorly understood academically – I can’t think of papers which address this. It’s this challenge of alpha mixing. So, I have dozens of different alphas. Again, people may vary, but I’ve heard that the large quantitative firms you might think about may have tens to even thousands of different individual types of alphas. How do you combine them into a single coherent view of the market? There’s a challenge there.
The second challenge is, if you have teams working on different kinds of alphas, how do you manage correlation between the different alphas? So, say I have one alpha that is a momentum alpha, and I have some other alpha based on data on parking lot traffic, and they turn out (hypothetically) to be 90% correlated. This new signal I discovered is useless. Basically, I’m uncovering the same anomaly I found before. So, you want to set up your research in a way that you’re finding new alpha that’s uncorrelated to what you’ve found before. And sometimes that’s a challenge, because maybe the things that are easiest to discover are the things you’ve already found.
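[Editor’s note: a sketch of the correlation check described here, using the hypothetical 90% threshold from the example. The “parking lot” series is simulated as momentum plus a little noise, precisely to reproduce the redundancy problem.]

```python
import numpy as np

def is_redundant(candidate, existing_alphas, max_corr=0.9):
    """Return the name of an existing alpha the candidate duplicates
    (|correlation| >= max_corr), or None if it adds something new."""
    for name, series in existing_alphas.items():
        if abs(np.corrcoef(candidate, series)[0, 1]) >= max_corr:
            return name
    return None

rng = np.random.default_rng(0)
momentum = rng.standard_normal(500)
# Hypothetical new signal that is really just momentum plus noise
parking_lot = momentum + 0.1 * rng.standard_normal(500)
fresh = rng.standard_normal(500)  # genuinely independent signal

print(is_redundant(parking_lot, {"momentum": momentum}))  # momentum
print(is_redundant(fresh, {"momentum": momentum}))        # None
```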
So, I think there are a couple of challenges there. There are also, I think, interesting implications in terms of the competitive landscape. This idea that there are increasing returns to scale because of needing to exceed this fixed cost, this idea that you’re going to need many sources of alpha and that you’re going to combine them, really favours firms which are structured to have a handful of trading strategies with many people working on them. For example, Two Sigma has a handful of trading strategies. But within each of those trading strategies, they might have, I don’t know how many, but probably dozens of people working on alpha, another set of people working on portfolio optimisation and so on. Renaissance, famously, is structured around a handful of trading strategies and so on.
I contrast that with places which are more like pod shops, where you have completely independent PMs. Maybe a team of one or two or three, each doing their own thing, not coordinating in terms of combining their alphas to unlock synergies, not managing correlation. Maybe you have a dozen different PMs, but if they’re all 90% correlated, you might as well just take the first one and do 10x the positions. So, I think there are interesting structural issues in terms of which competitive model will be more successful.
Understanding Time Horizons of Different Markets
Bilal Hafeez (34:40):
Yep. Yeah, absolutely. And how do you think about time horizons of models? Because you could have a model which works at the one-minute or five-minute level, or you can have a model that works at two weeks, or two months, or maybe even two years. Is time horizon something that would lead you to differentiate the way you treat those models, or do you just view them similarly?
Ciamac Moallemi (35:02):
No. You need to treat things with different time horizons differently. So, let me give you an example of a common predictive factor called order book imbalance. This is well studied; you can read papers about it. Basically, if you look at the limit order book for NASDAQ, and there are more buy orders than sell orders, the price is probably going to go up. Everybody knows this, and the effect is weak. You’re not going to make back your transaction costs. But again, in combination with all these other things, that might be one more alpha you throw in. Now, something like order book imbalance is realized very quickly. Maybe a minute, if not less. Right.
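[Editor’s note: the order book imbalance factor mentioned here is typically computed roughly as follows; the quoted sizes are hypothetical.]

```python
def order_book_imbalance(bid_size, ask_size):
    """Top-of-book imbalance in [-1, 1]: positive values mean more
    resting buy interest, which weakly predicts a short-term uptick."""
    return (bid_size - ask_size) / (bid_size + ask_size)

# Hypothetical NASDAQ top of book: 800 shares bid vs 200 offered
print(order_book_imbalance(800, 200))  # 0.6
```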
And let’s contrast that with another alpha based on a model of floor traffic at Walmart, where you may be able to make an accurate prediction of how Walmart is doing. How long is your prediction going to take to be realized? You may need an external catalyst – Walmart reveals how they’re doing and you were right. So, you have one alpha that’s realized over a minute, and one alpha that might be realized, let’s say, over two months.
And so, if I just add those two and say my composite prediction over the next two months is the sum of the first prediction and the second prediction, that misses the first alpha. I have to trade on that right now. If I don’t trade on that right now, it will be gone 10 seconds from now. So, I need to be aggressive about a high-frequency alpha. On the other hand, with something that’s two months, I can be very slow getting into that trade. And indeed, maybe I should be, because I’ll get into it more cheaply if I trade slowly. So, you can’t just naively combine these. You have to think more about trajectories.
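[Editor’s note: one simple way to capture this difference in urgency is to give each alpha a decay half-life. The signal sizes and half-lives below are purely illustrative assumptions, not anything from the conversation.]

```python
def remaining_alpha(alpha_bps, half_life_s, delay_s):
    """Alpha (bps) still expected if you only start trading delay_s
    seconds from now, assuming exponential decay of each signal."""
    return alpha_bps * 0.5 ** (delay_s / half_life_s)

fast = {"alpha_bps": 3.0, "half_life_s": 30.0}           # order-flow style
slow = {"alpha_bps": 50.0, "half_life_s": 60 * 86400.0}  # ~2-month signal

# Waiting one minute destroys most of the fast alpha...
print(remaining_alpha(**fast, delay_s=60))            # 0.75 of 3.0 bps left
# ...while the slow alpha is essentially untouched
print(round(remaining_alpha(**slow, delay_s=60), 2))  # 50.0 (to 2 dp)
```

The fast signal must be traded immediately and aggressively; the slow one can be worked into slowly and cheaply, which is why a naive sum of the two predictions misallocates urgency.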
When we think about time horizon, that’s one angle: a lot of these market anomalies (alphas) have their own intrinsic time horizon. There’s some underlying phenomenon, and it takes a certain amount of time for it to get realized in the price. Now, the second thing is that the assets you trade also have their own intrinsic time horizon. And this is a function of things like volatility, liquidity, risk aversion etc. But let me focus on one of those because I think it’s the clearest – liquidity. Let’s say you’re trading US equities – the S&P or the top 500 large-cap US equities. If you look at the trading costs for the S&P, they’re significantly smaller than the trading costs for the 500 individual stocks, simply because it’s more liquid. And the time horizon you should care about on the trading side has to do with the liquidity. At an extreme, if there were no transaction costs, if it was free to trade an asset, your time horizon should be very small, almost instant, right?
Bilal Hafeez (37:30):
Yeah.
Ciamac Moallemi (37:31):
Because if you think that over the next instant of time it’s going to go up, you should buy. And then you’ll revisit that, because you can sell later and it’s going to cost you nothing. So, if there’s no cost getting in and out, your time horizon should be very short, right?
Bilal Hafeez (37:42):
Yeah.
Ciamac Moallemi (37:42):
On the other hand, if an asset is less liquid, if there is a significant cost to getting in and out, now you’re going to start to care about making a prediction at a longer horizon. Because it costs money to get in and get out. And so even in a restricted world of, let’s say, the top 3,000 US equities, the intrinsic time horizon for something very liquid versus something very illiquid might be different by an order of magnitude. You want to trade them in a consistent, coherent way. It does not make sense to trade Apple with a typical hold time of two weeks and also do that with a much less liquid stock, given that it’s very cheap to get in and out of Apple in large size. I think this is most extreme in equities, which is mainly my expertise. And as I mentioned, the assets also have an intrinsic time horizon.
So, I think both on the predictive side and on the trading side, you really want to think about it not in terms of, okay, I’m going to focus on what’s going to happen in the next two weeks. You really want to think about it as a trajectory. Not just what’s the end point, but what’s the path. And it’s mathematically interesting to think about how to do that.
The Trend to Winner-takes-all With Quant Investors
Bilal Hafeez (38:48):
Yeah, absolutely. And so, if you step back and you look at the financial or the investment community today, what do you see as some of the key trends that are emerging in the quantitative investment space?
Ciamac Moallemi (39:02):
So, I think we’ve touched on one of them, which is these increasing returns to scale that we see as well in other technology businesses, like in the high-tech sector. There are incentives for winner-take-all dynamics, where you’ll have a handful of large firms who are very successful at the expense of others. In the quantitative trading space, again, people are very secretive about their returns and their investment flows and so on. So, it’s hard to say, but I think you can see this more clearly in the high-frequency space. I’m not a high-frequency trader and I don’t have special information, but my understanding is that basically Citadel was crushing everyone in that space. Citadel Securities is taking, hypothetically, 70%, 80% of the profits in that space. And again, I think it’s because there are increasing returns to scale in combining these various ideas. So, one phenomenon I think we’re going to see is consolidation and winner-take-all dynamics. In the same way, Google crushes everyone else in search.
I think another phenomenon we’re going to see is larger reliance on computation. And I think this is something we already see in the more traditional AI world – the world of computer vision and playing chess and so on. In that AI world, there is now a paradigm where, look, if you have good ideas and you can be clever and so on, that’s great. But another way to achieve performance is simply to throw a lot of compute at the problem. And the numbers are really stunning.
For example, DeepMind is a unit of Google that does a lot of this stuff, famous for their computer players for chess and Go etc. If you try to replicate one of their papers, for example their AlphaZero programme, which is a neural network player for some of these games, even without the research cost – even if you just wanted to implement the final system that they came up with – the compute time on AWS or whatever would cost $20 million, $30 million.
There are enormous sums of money being invested in computation. Again, I don’t have any special information, but I’ve heard it speculated that Google, which perhaps owns the largest computational infrastructure on the planet (data centres everywhere, lots of computers), is deploying it on machine learning models. What do they do with all these computers? Is it search? Is it serving ads and so on? I’ve heard that their number one workload is actually training machine learning models. And so, I think you’re going to start to see that in finance as well.
I heard an anecdote about one large hedge fund (which we’re all familiar with, but I won’t name) where a typical quant alpha researcher is allocated 10,000 processors. So, they have a running budget of 10,000 processors to run various types of back tests, do a big sweep over parameters, try data sets, whatever you want. I think another major trend we’re going to start to see is people trying to get an edge by buying compute time (again, if you can get away without all that CPU, that’s great). If you’re Renaissance Technologies and you have a ton of money, that’s a no-brainer. You’re going to buy a lot of computing time. So again, I think that will feed into more and more consolidation, because not many people can afford it.
Bilal Hafeez (42:10):
So where does that leave boutiques? Is there a role for boutiques or smaller funds then?
Ciamac Moallemi (42:16):
I think you’ve got to find a niche where you’re doing something different that may be unattractive to other people. Oftentimes, people think of it in terms of capacity versus, let’s say, Sharpe ratio. There may be strategies where you can do very well but that are very capacity-limited. For example, a strategy might make $10 million a year. Well, if you’re Renaissance Technologies, that’s probably not going to move the needle. It may not register for Jim Simons to make an extra $10 million a year, versus if you’re an individual, that might be a fantastic amount of money. So, there might be niches that are not interesting to these bigger players where smaller people can have an advantage.
Why Bitcoin and Crypto Technology is Special
Bilal Hafeez (42:56):
Yeah, understood. And at the top of our conversation, you mentioned you had some interest in Bitcoin, or some focus on Bitcoin. How do you view that? It sounds like you think it lends itself well to your kind of quantitative process. That’s what I inferred from your earlier…
Ciamac Moallemi (43:14):
My interest so far in Bitcoin has been in the underlying technology. I think at a technological level, it’s totally amazing. It’s amazing that you can have hundreds of billions of dollars exchanged every day without recourse to any legal system, without banks involved and so on. And it’s all done via a very clever set-up of incentives. Everyone is incentivized to do the right thing. So, at that level, I think it’s amazing. As an academic, I think there are lots of interesting questions to ask in terms of how the system works, and can it work better?
So, there are a number of issues. It’s amazing that it works, but it’s wildly inefficient. If you look at how much money it costs to operate the Bitcoin system, in terms of what is paid to miners, the sums are staggering. And so, could we do that better? That’s one question. I think another thing that’s quite interesting is what people call the DeFi space (decentralised finance), where you have financial products which have emerged that have a very different flavour from conventional products. For example, the typical market structure in traditional markets (equities, futures etc.) is a centralised electronic limit order book. That’s the way most markets are organised.
In this decentralised finance world, that model doesn’t work. If you’re on a blockchain, you’re in an environment where computation is very constrained, and you can’t have people inserting all these orders and cancelling them. So, the structure that’s emerged there, which does not involve exchanges and is completely decentralised, is this thing called automated market makers. Maybe you’ve heard of things like Uniswap and so on. They’re organised quite differently. And I think they’re not well understood. And so, it’s super interesting to me to think about that.
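[Editor’s note: a stylised sketch of the constant-product design behind automated market makers such as Uniswap v2, where the pool reserves must satisfy x·y = k. The pool sizes are hypothetical and the 0.3% fee matches the Uniswap v2 convention; this is illustrative, not production code.]

```python
def amm_swap(x_reserve, y_reserve, dx, fee=0.003):
    """Swap dx of token X into a constant-product pool and return the
    amount of Y paid out, keeping x * y = k after a 0.3% fee.
    No order book, no matching engine: just a formula on reserves."""
    dx_net = dx * (1 - fee)
    k = x_reserve * y_reserve
    new_x = x_reserve + dx_net
    dy = y_reserve - k / new_x
    return dy

# Pool holding 1,000 X and 1,000 Y; sell 10 X into it
out = amm_swap(1000.0, 1000.0, 10.0)
print(round(out, 4))  # slightly less than 10 Y: price impact plus the fee
```

Note how little computation this needs per trade, which is exactly why the design suits a blockchain where computation is expensive and continuous order cancellation is impractical.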
Again, from an academic perspective, crypto and blockchain are interesting at many levels. At the lowest level, how do these things operate? Who’s paid what? How do we know they’re secure? Lifting up the cover a bit, you start to see differences in these kinds of blockchain-oriented markets. Because they operate in blocks, they’re fundamentally discrete. In most financial markets, time is more or less continuous. You can cancel your limit order at any instant in time. But if you’re operating an automated market maker, let’s say on the Ethereum blockchain, and an Ethereum block is generated, let’s say, every 10 seconds, then time is discrete. That’s fundamentally different. Computation and storage on a blockchain are very restricted. At an even higher level, these completely decentralised markets are organised differently. I think there are a lot of interesting things there. So, that’s my interest as an academic.
Taking the perspective of trading in these markets, I find that a bit more challenging, because there are a lot of issues that we don’t usually think about, things like counterparty risk and regulatory risk and so on. Some of these decentralised mechanisms are so complicated. People talk about Money Legos, but at the end of the day, you have incredible complexity and it’s hard to reason about them. So, I’ve been more interested in understanding what’s going on and coming up with models for the right way to think about some of these products, which I think we don’t have yet, rather than actively trading these products. They’re really quite different. And I’m not sure everything I know about quantitative trading in equities and futures and so on maps over to that space.
What Quants are Looking for When they Hire
Bilal Hafeez (46:56):
So yeah, those are really good points that you made there. It really does feel like this is a true innovation in these systems; it’s quite revolutionary in many ways. Another question I wanted to ask as we round off our conversation on finance is about hiring in the quant space. Obviously, there have been big advances in computing and so on. You have the finance crowd, you have the maths, engineering, computing guys. What qualities would you look for if you were to hire people into a quant fund?
Ciamac Moallemi (47:32):
That’s a tough one. I think at a certain level you need mathematical sophistication, but if you have someone who’s too mathematical, that can actually inhibit your thinking and make things too rigid. There are different kinds of roles, but maybe the most common role would be a quant researcher. I think you almost want experimentalists. It’s almost like somebody doing physics or chemistry: they’re doing experiments, being very careful, coming up with theories and hypotheses, trying to prove or disprove them, except the experiments are not in the lab but are computational and data-oriented. So obviously, I’m not saying go out and hire a chemist, but I think you want that research mentality, someone who’s very careful, organised and thoughtful in the way that they do research.
Bilal Hafeez (48:19):
Yeah, that makes sense. What I’ve found when I’ve worked with quants or hired quants is the risk of having somebody who’s from a very theoretical background; when you put them into a quant role, that often leads to all sorts of challenges. On paper and in reality, they are very smart, but it’s a different type of smarts that doesn’t lend itself as well to the messiness of markets sometimes.
Ciamac Moallemi (48:43):
Yeah, and I think, again, the computational side is more and more important. So, if you have someone who’s very mathematically oriented and solid and so on, but they’re just less facile with quickly iterating and trying a bunch of things computationally, that person is going to be at a disadvantage. And even putting aside the quant thing, even at Columbia, the PhD students we select are mainly very mathematical. The papers they’re going to write have a large theoretical component, theorem-proof style stuff and so on. But it’s becoming increasingly important to have that computational skill, because a lot of what guides the theory is being able to iterate computational experiments. And I think it’s the same in the quant space.
I think it used to be that you could be a bright undergrad, maybe not even in a quantitative major, but just be quick on your feet and get hired at a place like D. E. Shaw or whatever. But now, what I see is that the people getting hired not only have much more relevant experience, like a quantitative degree in things relevant to probability and optimisation and so on. In many cases, firms are going after people who have done specific finance work, like maybe a research problem in quantitative finance, or maybe your advisor was someone like me who works on those problems. So, I think that space has become very competitive and specialized. 20 years ago, that was not the case.
Ciamac’s Productivity Hacks and Book Picks
Bilal Hafeez (50:06):
Now, that’s true. And just now on a more personal level, I like to ask a few personal questions as well. One question I’d like to ask is how you manage your information or research flow, because presumably you actually straddle lots of different worlds. There are papers being thrown at you all the time, you probably have your own research, and there’s financial news coming through all the time, new techniques being discovered, new data. How do you manage and curate all of that?
Ciamac Moallemi (50:31):
My number one piece of advice would be to write things down and take lots of notes. I have dozens of research notebooks, both for my academic work and my practitioner work. You sometimes have sparks of inspiration, and it’s easy to forget about them. Different people have different processes. Maybe you use text files on your computer; I like written notebooks. But try to capture ideas and then revisit them.
In terms of incoming information flow, this is a big challenge, and I don’t know that I have a good answer or that I’ve managed it well. On the one hand, let’s say I’m interested in things like cryptocurrencies and blockchain. This area is evolving so rapidly that in order to be relevant, you have to work on the hot topics now, because maybe the thing you thought was interesting six months ago, nobody cares about anymore. It turned out not to be a good idea and the market has moved on. On the other hand, in my experience, if you ingest too much information, a lot of that is noise and you get confused. Also, there’s a chance that your ideas end up being too conventional, because you’re being pushed by the same environmental drivers as other people. So that’s not a satisfying answer. There’s a trade-off, and I don’t know how to handle it.
Bilal Hafeez (51:37):
Yeah, those are reasonable points. And then finally, I’m a big reader of books. Is there any book or books that have really influenced you in the way you think about markets or quantitative finance?
Ciamac Moallemi (51:51):
So, let me give three books. The first two are academic books which are more methodology-oriented, but I think they talk about the right methodology. The first would be The Elements of Statistical Learning by Friedman, Hastie, and Tibshirani. That’s a practically oriented book on different types of statistics and machine learning, in terms of the pros and cons of these different methods and how they work. It talks about things like overfitting and so on. On the predictive modelling side, that’s maybe the most important thing.
In terms of the control side, on how to make decisions, there’s a series of two books called Dynamic Programming and Optimal Control, Volumes I and II, by Dimitri Bertsekas at MIT. Dynamic programming is one of my research areas, and I think those books give a lot of mathematical intuition for making trading decisions: how can I understand what the future consequences of a decision are?
And then lastly, I’ve read a lot of practitioner books as well. The book that’s been most influential for me in terms of the way I think about the markets is Active Portfolio Management by Grinold and Kahn. I think that captures a lot of this scalable decomposition approach I talked about, in terms of giving intuition for building predictive models and the portfolio construction side. How do you think about transaction costs and important concepts like the information ratio, diversifying over time, diversifying over space, and so on. It’s a little bit old now, but that would be my go-to in terms of how to philosophically think about quantitative trading.
Bilal Hafeez (53:28):
No, no, that’s really good. And we’ve had a lot of food for thought in this conversation, a lot to learn. And if people wanted to follow your work, what’s the best place for them to go to see your work?
Ciamac Moallemi (53:37):
They can follow me on Twitter. I’m @ciamac, C-I-A-M-A-C, my first name. Also, my website is moallemi.com, M-O-A-L-L-E-M-I.com. I’m the only Ciamac Moallemi on the internet, so I’m not hard to find.
Bilal Hafeez (53:51):
Yeah, I have to admit, I did have to ask Ciamac how to pronounce his name earlier as well. So yes, I do imagine you have a very unique name. With that, thanks a lot. It was an excellent conversation. I look forward to seeing more of your work and staying in touch, hopefully, as well.
Ciamac Moallemi (54:06):
It was fun to be on.
Bilal Hafeez is the CEO and Editor of Macro Hive. He spent over twenty years doing research at big banks – JPMorgan, Deutsche Bank, and Nomura, where he had various “Global Head” roles and did FX, rates and cross-markets research.
(The commentary contained in the above article does not constitute an offer or a solicitation, or a recommendation to implement or liquidate an investment or to carry out any other transaction. It should not be used as a basis for any investment decision or other decision. Any investment decision should be based on appropriate professional advice specific to your needs.)