<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Zero Lag Club]]></title><description><![CDATA[A pragmatick grimoire for forging high‑performance quant and chain‑wrought trading engines—sans intermediaries; naught but thee, thy code, and the markets.
None of these scrolls be financial advice.]]></description><link>https://zerolag.club</link><image><url>https://substackcdn.com/image/fetch/$s_!owNg!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a75c51f-f066-4f0f-be3a-17dd821797b3_241x241.png</url><title>Zero Lag Club</title><link>https://zerolag.club</link></image><generator>Substack</generator><lastBuildDate>Fri, 10 Apr 2026 11:31:11 GMT</lastBuildDate><atom:link href="https://zerolag.club/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[ZeroLag, LLC.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[zerolag@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[zerolag@substack.com]]></itunes:email><itunes:name><![CDATA[crypt0grapher]]></itunes:name></itunes:owner><itunes:author><![CDATA[crypt0grapher]]></itunes:author><googleplay:owner><![CDATA[zerolag@substack.com]]></googleplay:owner><googleplay:email><![CDATA[zerolag@substack.com]]></googleplay:email><googleplay:author><![CDATA[crypt0grapher]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Lecture 6: The Grossman-Miller Market Maker: A Pragmatic Treatise on Liquidity Provision]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/lecture-6-the-grossman-miller-market</link><guid isPermaLink="false">https://zerolag.club/p/lecture-6-the-grossman-miller-market</guid><dc:creator><![CDATA[crypt0grapher]]></dc:creator><pubDate>Tue, 16 Dec 2025 23:04:37 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ba63ff39-c428-4038-a807-b3f9f6820f89_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader!</p><p>It&#8217;s been a while! Hope you are well!<br>Have you ever pondered the plight of the market maker?<br>The &#8220;invisible hand of the market&#8221; is the one who stands ready to buy when others wish to sell, and sell when others wish to buy. This is a curious activity, one that requires both fortitude and a most sophisticated understanding of risk.</p><p><a href="https://zerolag.club/p/lecture-4-the-glostenmilgrom-market">The Glosten-Milgrom model</a> we reviewed&nbsp;last time clearly explains trading with informed and uninformed traders. Long story short, market makers lose out when trading against informed traders, and to recover those losses and earn profits, MMs trade with liquidity (noise, retail) traders, creating a combined effort to balance the market, recoup losses, and book profits.</p><blockquote><p>I highly recommend the very first article on adverse selection and the market maker business - the one published by Jack Treynor (aka &#8220;Bagehot&#8221;) in 1971, called &#8220;The Only Game in Town&#8221; - clear and concise, and not a single formula was used.</p></blockquote><p>Modelling participants with whom to trade is one part of the MM story (yielding a mathematical proof that one should always buy from uninformed traders, which we did at the Glosten-Milgrom lecture). The other crucial aspect is <strong>inventory management</strong> - as an MM, we don&#8217;t want to hold assets longer than necessary. </p><p>The Grossman-Miller model addresses exactly that. </p><p>It all started in the year of our Lord 1988, when Sanford J. Grossman and Merton H. 
Miller illuminated this mystery with their seminal work on market making and liquidity provision. Their model reveals how market makers extract compensation for their services whilst managing inventory risk.</p><p>Fear not! For we shall not merely theorise&#8212;we shall implement! By the article's end, you shall possess both the mathematical prowess and <a href="https://grossman-miller-simulator.zerolag.club/">the playground to simulate your very own market-making operation. </a></p><p>&#128293; Shall we commence?</p><h2>The Market Maker's Fundamental Dilemma &#127917;</h2><p>Consider, if you will, the market maker's predicament:</p><p>1. <strong>No intrinsic desire for inventory</strong>: Unlike a merchant who stocks &#8220;assets&#8221; (to be fair, shitcoins, primarily) for eventual profit, we, as a market maker, have no inherent wish to hold these stinky bags at all.</p><p>2. <strong>Temporal mismatch</strong>: When accepting one side of a trade, we must wait&#8212;sometimes interminably&#8212;for a counterparty to materialise.</p><p>3. <strong>Price risk exposure</strong>: During this waiting period, the cruel hand of fate may move prices against us.</p><p>This triumvirate of challenges forms the core of what Grossman &amp; Miller sought to model. There are more modern models addressing these challenges. In this post, let us start with how G&amp;M approached this most vexing problem.</p><h2>The Grossman-Miller Framework: A Three-Act Play &#127914;</h2><p>Let&#8217;s simplify the unfolding drama down to three time periods.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;t \\in \\{1, 2, 3\\}&quot;,&quot;id&quot;:&quot;YBRHKLHRSS&quot;}" data-component-name="LatexBlockToDOM"></div><p>With a cast of characters:</p><p>- <em><strong>n</strong></em> identical Market Makers (<em><strong>MMs</strong></em>): Our protagonists, initially holding no assets but armed with initial wealth <em><strong>W_0.</strong></em></p><p>- <strong>Liquidity Trader 1 (</strong><em><strong>LT1</strong></em><strong>)</strong>: Arrives at time <em><strong>t=1</strong></em> with <em><strong>i</strong></em> units to trade (that is, if <em><strong>i</strong></em> is positive, they arrive holding <em><strong>i</strong></em> assets they wish to sell; if negative, they are here to buy).</p><p><strong>- Liquidity Trader 2 (</strong><em><strong>LT2</strong></em><strong>)</strong>: Appears at <em><strong>t=2</strong></em> with exactly <em><strong>-i</strong></em> units to trade (what serendipity!)</p><p>Now let's start cooking. We assume that <strong>all participants are risk-averse, </strong>more precisely, they exhibit risk aversion with the utility function:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;U(X) = -\\exp(-\\gamma X)&quot;,&quot;id&quot;:&quot;TUBEPZMLHP&quot;}" data-component-name="LatexBlockToDOM"></div><p>Where <em><strong>X</strong></em> is terminal cash - the future cash value of the agent&#8217;s position.</p><p><em><strong>&#947; &gt; 0</strong></em> captures the degree of risk aversion. If you meet this elegant formula for the first time, it&#8217;s a <a href="https://en.wikipedia.org/wiki/Exponential_utility">classic constant absolute risk aversion utility function</a>. Essentially, what we need to understand is that this <strong>utility function</strong> just maps money into &#8220;how much one likes it&#8221; value, and this function is concave, meaning that every new buck gives less satisfaction. 
That means that the risk-averse trader always prefers a sure amount to a fair gamble with the same expected value. <br>That&#8217;s basically the beauty of this exponential formula:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;U(\\mathbb{E}[X]) > \\mathbb{E}[U(X)]\n&quot;,&quot;id&quot;:&quot;TZWFMBEGBS&quot;}" data-component-name="LatexBlockToDOM"></div><p>The utility (<em><strong>U</strong></em>) of getting the average (<strong>E</strong>) payoff (X) is always higher than the average utility of the risky payoff.</p><h2>Solving The Model, Backwards: <br>The Mathematics of Liquidity &#129518;</h2><p></p><h4>Act III: The Denouement (t = 3)</h4><p>At the last timestamp, <em><strong>t=3</strong></em>, the asset's true value is revealed:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_3 = \\mu + \\epsilon_2 + \\epsilon_3&quot;,&quot;id&quot;:&quot;RUJDNRRNVJ&quot;}" data-component-name="LatexBlockToDOM"></div><p>- <em><strong>&#956;</strong></em> is a constant (the fundamental value of the asset).</p><p>- <em><strong>&#949;_2</strong></em> and <em><strong>&#949;_3</strong></em> are independent price updates announced between periods - we assume they are independent normally distributed random variables with mean zero and variance <em><strong>&#963;^2 </strong></em>(written as <em><strong>&#8764;N(0,&#963;^2)). &#949;_t </strong></em>becomes known between <em><strong>t-1</strong></em> and <em><strong>t</strong>, </em><strong>e.g.</strong> &#949;_2 is not known at step 1 but is announced by step 2.</p><h4>Act II: The Matching (<em><strong>t = 2)</strong></em></h4><p>Walking backwards to a step earlier, each agent <em><strong>j</strong></em> maximises the utility function:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\max_{q_{2j}} \\; \\mathbb{E}\\big[U(X_3^j)\\mid \\varepsilon_2\\big]&quot;,&quot;id&quot;:&quot;SCGXVDPFCZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>By <em><strong>t=2</strong></em>, we already know <em><strong>&#949;_2</strong></em>, and we maximise the expected <em><strong>U(X3j)</strong></em> over the remaining randomness.</p><p>Subject to the budget constraints:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\begin{align}\n(1) \\space X_3^j &amp;= X_2^j + q_{2j} S_3, \\\\\n(2) \\space X_2^j + q_{2j} S_2 &amp;= X_1^j + q_{1j} S_2.\n\\end{align}&quot;,&quot;id&quot;:&quot;IBUSFPWIMM&quot;}" data-component-name="LatexBlockToDOM"></div><p>These are the key expressions to understand; the rest follows from them. </p><p><em><strong>(1)</strong></em> states that the cash account for agent <em><strong>j</strong></em><strong> </strong>at step <em><strong>3,</strong></em><strong>&nbsp;</strong><em><strong>X3j,&nbsp;</strong></em>equals the cash value agent j had at step 2 plus the revenue from selling&nbsp;<em><strong>q2j&nbsp;</strong></em>units of the risky asset at the price&nbsp;<em><strong>S3&nbsp;</strong></em>at&nbsp;time t=3.</p><p><em><strong>(2)</strong></em> states that the wealth before step 2 (right side, which is cash&nbsp;<em><strong>X1j</strong></em><strong>&nbsp;</strong>and the&nbsp;<em><strong>q1j</strong></em>&nbsp;assets priced&nbsp;<em><strong>S2</strong></em>) equals the wealth after step 2 (left side, which is cash&nbsp;<em><strong>X2j&nbsp;</strong></em>and assets left after step 2,&nbsp;<em><strong>q2j,&nbsp;</strong></em>by their price S2). 
This makes sense as a <em><strong>self-financing constraint</strong></em>. No new money is injected or pulled out - it&#8217;s just swapping inventory for cash from the same pocket.</p><p><br>Given our exponential utility and normal distributions, the optimal portfolio becomes:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;q_{2j}^\\ast\n  = \\frac{\\mathbb{E}[S_3 \\mid \\varepsilon_2] - S_2}{\\gamma \\sigma^2}&quot;,&quot;id&quot;:&quot;SMFGNOEHGQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>Market clearing requires:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;n q_2^{MM} + q_2^{LT1} + q_2^{LT2} = 0.&quot;,&quot;id&quot;:&quot;QFSUTNDCOY&quot;}" data-component-name="LatexBlockToDOM"></div><p>Since all agents are identical save for their endowments, and </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;q_1^{LT2} = -i&quot;,&quot;id&quot;:&quot;FSIIKZMSVM&quot;}" data-component-name="LatexBlockToDOM"></div><p> we obtain:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_2\n  = \\mathbb{E}[S_3 \\mid \\varepsilon_2]\n  = \\mu + \\varepsilon_2.&quot;,&quot;id&quot;:&quot;NONMSMMIRZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>A most satisfying result! The price at <em><strong>t=2</strong></em> equals the conditional expectation&#8212;efficiency reigns supreme when matching orders arrive!</p><h4>Act I: The Initial Imbalance (t = 1)</h4><p>Now for the pi&#232;ce de r&#233;sistance! At <em><strong>t=1</strong></em>, only LT1 and the MMs participate. Each maximises expected utility knowing that at <em><strong>t=2</strong></em> they'll exit with zero inventory.</p><p>The optimal holdings become:<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;q_{1j}^\\ast\n  = \\frac{\\mathbb{E}[S_2] - S_1}{\\gamma \\sigma^2}.&quot;,&quot;id&quot;:&quot;BIGVKJDIZE&quot;}" data-component-name="LatexBlockToDOM"></div><p>Market clearing with</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;q_0^{LT1} = i \\mbox{ and } q_0^{MM} = 0  \\mbox{ yields }&quot;,&quot;id&quot;:&quot;PKOHFGLIVH&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;i = (n+1)\\,\\frac{\\mu - S_1}{\\gamma \\sigma^2}&quot;,&quot;id&quot;:&quot;SVYVBKDWCC&quot;}" data-component-name="LatexBlockToDOM"></div><p><strong>Therefore:</strong></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_1\n  = \\mu - \\frac{\\gamma \\sigma^2 i}{n+1}&quot;,&quot;id&quot;:&quot;GMRYVEDRNY&quot;}" data-component-name="LatexBlockToDOM"></div><div class="pullquote"><p><strong>Behold! 
The liquidity discount emerges!</strong> </p></div><p>When LT1 sells i &gt; 0, the price drops below the fundamental value by </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{\\gamma \\sigma^2 i}{n + 1}&quot;,&quot;id&quot;:&quot;QISZYWQBWW&quot;}" data-component-name="LatexBlockToDOM"></div><p>That&#8217;s because the market makers must temporarily absorb the inventory imbalance <em><strong>i </strong></em>and carry that risk until it can be unwound later.<br><br>Look, liquidity cost is literally <em>volatility &#215; risk-aversion &#215; imbalance</em> and the MM&#8217;s discount is a risk premium for warehousing bags:</p><ol><li><p>Vol spikes (<em><strong>&#963;^2</strong></em> goes up) &#8594; liquidity gets expensive <em>even if fundamentals don&#8217;t change</em>.</p></li><li><p>Dealers get more risk-averse (<em><strong>&#947;</strong></em> goes up) &#8594; same.</p></li><li><p>Bigger one-sided flow (greater <em><strong>i</strong></em>) &#8594; price must move more to bribe someone to hold it.</p></li></ol><div class="pullquote"><p><strong>Competition socialises inventory risk!</strong></p></div><p>Indeed, the factor </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{1}{n+1}&quot;,&quot;id&quot;:&quot;CCBHVLKDQM&quot;}" data-component-name="LatexBlockToDOM"></div><p>&#8203; says: more market makers &#8594; smaller discount. That&#8217;s intuitive: the same shock <em><strong>i </strong></em>gets divided across more balance sheets, so each dealer carries less risk and demands less compensation.</p><div class="pullquote"><p><em><strong>Risk aversion &#947; is literally how expensive liquidity is</strong></em></p></div><p>If <strong>&#947;&#8594;0</strong> (risk-neutral dealers), then <em><strong>S1&#8594;&#956;</strong></em>: no liquidity discount. As <em><strong>&#947; </strong></em>increases, market makers hate holding inventory more, so they move price further away from <em><strong>&#956;</strong></em> to get paid for carrying risk. So the &#8220;cost of immediacy&#8221; comes from <strong>risk aversion</strong>, not from information asymmetry. 
A risk-neutral world has <em>free</em> immediacy.<br></p><h4>A Playground &#128013;</h4><p>We&#8217;ve built an interactive web app that brings this legendary paper to life.</p><p><a href="https://grossman-miller-simulator.zerolag.club/">https://grossman-miller-simulator.zerolag.club/</a></p><p>How to use:</p><ol><li><p>Adjust parameters (# of market makers, trade size, volatility, risk aversion)</p></li><li><p>Hit &#8220;Run Simulation&#8221; and step through t=1 &#8594; t=2 &#8594; t=3</p></li><li><p>Watch prices, positions, and P&amp;L evolve in real-time</p></li><li><p>Toggle &#8220;Show Formulas&#8221; to see the underlying math (KaTeX rendered)</p></li><li><p>Share scenarios via URL&#8212;params are encoded in the query string</p></li></ol><p>I encourage you to think about the following while playing with it:</p><ul><li><p>How prices move when buy/sell orders arrive asynchronously</p></li><li><p>Why market makers earn a &#8220;liquidity premium&#8221; for bearing inventory risk</p></li><li><p>The magic number: n/(n+1) &#8212; how many MMs determines how much immediacy you get</p></li><li><p>Why adding more market makers compresses spreads</p></li><li><p>Price autocorrelation from inventory unwinding (yes, it&#8217;s negative!)</p></li></ul><p>Feel these equations with this illuminating simulation!</p><p><br>Do not risk rashly - or at the very least, not for free - and see you soon! &#128640;</p><h4>Reading List &#128214;</h4><ol><li><p>&#193;lvaro Cartea, Sebastian Jaimungal, Jos&#233; Penalva &#8212; <em>Algorithmic and High-Frequency Trading</em> (Cambridge University Press, 2015). <a href="https://assets.cambridge.org/97811070/91146/frontmatter/9781107091146_frontmatter.pdf">Cambridge Assets</a></p></li><li><p>Sanford J. Grossman, Merton H. Miller &#8212; &#8220;Liquidity and Market Structure,&#8221; <em>The Journal of Finance</em>, <strong>43</strong>(3), 617&#8211;633 (1988). DOI: 10.1111/j.1540-6261.1988.tb04594.x <a href="https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-6261.1988.tb04594.x">Wiley Online Library</a></p></li></ol><div><hr></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>That follows from maximising the exponential utility function:<br>At time t=1, agent j chooses how many units q1j to hold going into period 2. By the next step, the price is revealed as</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_2=\\mu+\\varepsilon_2, \\quad \\varepsilon_2 \\sim N(0,\\sigma^2)&quot;,&quot;id&quot;:&quot;FMQERHXPXW&quot;}" data-component-name="LatexBlockToDOM"></div><p>So the only uncertainty between <em><strong>t=1</strong></em> and <em><strong>t=2</strong></em> is the normal shock <em><strong>&#949;2.<br></strong></em>First, we write terminal (time-2) wealth in terms of the decision variable <em><strong>q1j&#8203;</strong></em>. 
If we buy <em><strong>q1j</strong></em>&#8203; units at price <em><strong>S1</strong></em>&#8203;, our cash decreases by <em><strong>q1j S1</strong></em>, but we will own <em><strong>q1j</strong></em>&#8203; units worth <em><strong>S2</strong></em>&#8203; at time 2:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;X_1^j \\;=\\; X_0^j - q_{1j} S_1 \\\n\\quad\\Longrightarrow\\quad\nX_2^j \\;=\\; X_0^j + q_{1j}(S_2 - S_1).&quot;,&quot;id&quot;:&quot;YFRJHIOZJY&quot;}" data-component-name="LatexBlockToDOM"></div><p>This is the key: the choice <em><strong>q1j</strong></em>&#8203; only scales the random price change <em><strong>S2&#8722;S1</strong></em>.</p><p>(I) Now maximise expected CARA utility:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\max_{q_{1j}} \\; \\mathbb E\\!\\left[-e^{-\\gamma X_2^j}\\right]\n\\;\\Longleftrightarrow\\;\n\\max_{q_{1j}} \\left\\{\n\\mathbb E[X_2^j] - \\frac{\\gamma}{2}\\operatorname{Var}(X_2^j)\n\\right\\}.&quot;,&quot;id&quot;:&quot;VLQUODBRTE&quot;}" data-component-name="LatexBlockToDOM"></div><p>Because <em><strong>X2j</strong></em>&#8203; is affine in a normal random variable, it is itself normal; and for CARA utility with normal wealth, maximising expected utility is equivalent to maximising the certainty equivalent (which is the right side of the above expression).</p><p>Then we compute the mean and variance; since S1&#8203; is known at time 1, we pull it out of <em><strong>E</strong></em>:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb E[X_2^j] \\;=\\; X_0^j + q_{1j}\\bigl(\\mathbb E[S_2]-S_1\\bigr),\n\\qquad\n\\operatorname{Var}(X_2^j) \\;=\\; q_{1j}^2 \\operatorname{Var}(S_2)\n\\;=\\; q_{1j}^2 \\sigma^2.&quot;,&quot;id&quot;:&quot;PUOPJQPVLT&quot;}" data-component-name="LatexBlockToDOM"></div><p>So by substituting that into (I), the optimisation reduces to a simple concave quadratic, and by taking the first-order condition, we obtain:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\max_{q_{1j}} \\left\\{\nq_{1j}\\bigl(\\mathbb E[S_2]-S_1\\bigr) - \\frac{\\gamma}{2}\\sigma^2 q_{1j}^2\n\\right\\}\n\\;\\Longrightarrow\\;\n\\mathbb E[S_2]-S_1 - \\gamma\\sigma^2 q_{1j}=0.&quot;,&quot;id&quot;:&quot;WFTZMKDBVH&quot;}" data-component-name="LatexBlockToDOM"></div></div></div>]]></content:encoded></item><item><title><![CDATA[Lecture 5: Solving the Glosten-Milgrom Market Making Model]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/lecture-5-solving-the-glosten-milgrom</link><guid isPermaLink="false">https://zerolag.club/p/lecture-5-solving-the-glosten-milgrom</guid><dc:creator><![CDATA[crypt0grapher]]></dc:creator><pubDate>Wed, 06 Aug 2025 21:57:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4c2bf715-7956-44a0-af63-b55caebb0c37_1536x1024.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader!</p><p>In the&nbsp;<a href="https://zerolag.club/p/lecture-4-the-glostenmilgrom-market">previous post,</a>&nbsp;we introduced the Glosten-Milgrom market-making model, which makes the difference between informed and retail (noise/liquidity) traders very clear. 
</p><p>The main conclusion is a proof of why a market maker doesn&#8217;t want to trade with informed traders - and if we do, we need enough noise traders to cover the loss we take trading with informed professionals.</p><p>There was a link to the <a href="https://github.com/crypt0grapher/glosten-milgrom-mm/blob/main/glosten_milgrom_notebook.ipynb">repo</a> with some charts and code to play with - do that if you haven&#8217;t yet!</p><p>In today&#8217;s post, I&#8217;d like to define the market-making task and explicitly solve the equilibrium in the Glosten-Milgrom model in three steps. I&#8217;d like to keep this framework so we can apply it to more sophisticated MM models later.</p><h2>The Task</h2><p><strong>Find equilibrium bid and ask prices that yield zero expected profit conditional on each trade side.</strong></p><blockquote><p>Why zero profit? Because in perfect competition, any deviation from zero expected profit would be arbitraged away by competitors.</p></blockquote><h2>Solution </h2><h3>Step 1: Conditional Probabilities</h3><p>Let's derive the MM's <em><strong>posterior</strong></em> probability that a trader is informed, given the observed trade direction.</p><h4>For a Buy Order</h4><p>An informed trader buys only if the fundamental is high (&#119907;=&#119907;&#119867;). So, the probability the trader is informed, conditional on seeing a buy order, is (the <a href="https://zerolag.club/p/lecture-4-the-glostenmilgrom-market">previous lecture</a> explains this formula; just to remind you, <em>&#956;</em> is the probability the trader is informed and <em><strong>q </strong></em>is the prior probability the asset is high-valued <em><strong>v(h)</strong></em> ):</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\Pr[I \\mid \\text{buy}] = \\frac{\\mu q}{\\mu q + \\frac{1 - \\mu}{2}}&quot;,&quot;id&quot;:&quot;NMKPKUAPRJ&quot;}" data-component-name="LatexBlockToDOM"></div><h4>For a Sell Order</h4><p>Similarly,</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\Pr[I \\mid \\text{sell}] = \\frac{\\mu (1 - q)}{\\mu (1 - q) + \\frac{1 - \\mu}{2}}&quot;,&quot;id&quot;:&quot;NQRZJGYMIE&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><h3>Step 2: Conditional Expected Values</h3><p>The MM sets prices based on expected fundamental values, conditional on trade direction.</p><p>If a buy order is observed, the MM revises upward (see <a href="https://open.substack.com/pub/crypt0grapher/p/lecture-4-the-glostenmilgrom-market?r=16wfjc&amp;selection=788d3577-5109-48f1-a541-47ef6747ca24&amp;utm_campaign=post-share-selection&amp;utm_medium=web&amp;aspectRatio=instagram&amp;textColor=%23ffffff&amp;bgImage=true">previous lecture</a> for the explanation):</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;a = E[v \\mid \\text{buy}] = \\Pr[I \\mid \\text{buy}] v_H + (1 - \\Pr[I \\mid \\text{buy}]) E[v] &quot;,&quot;id&quot;:&quot;FRNLAUMNMS&quot;}" data-component-name="LatexBlockToDOM"></div><p>If a sell order is observed, the MM revises downward:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;b = E[v \\mid \\text{sell}] = \\Pr[I \\mid \\text{sell}] v_L + (1 - \\Pr[I \\mid \\text{sell}]) E[v]&quot;,&quot;id&quot;:&quot;RTJCRWKBOH&quot;}" data-component-name="LatexBlockToDOM"></div><p>These conditional probabilities are derived using Bayes&#8217; theorem:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;Pr[I | buy] 
= \\frac{\\Pr[buy|I]\\Pr[I]}{\\Pr[buy|I]\\Pr[I] + \\Pr[buy|N]\\Pr[N]}&quot;,&quot;id&quot;:&quot;GPTBFHGBYP&quot;}" data-component-name="LatexBlockToDOM"></div><p>This basically reads as: given we&#8217;ve seen a buy, how likely is it that this buyer is informed (<em><strong>Pr[I|buy]</strong></em>)? It depends on how likely an informed trader would place a buy (<em><strong>Pr[buy|I]</strong></em>), how common informed traders are (<em><strong>Pr[I]</strong></em>), and how common buys are overall (the denominator, which is the expanded <em><strong>Pr[buy]</strong></em>).</p><h3>Step 3: <strong>Equilibrium Spread</strong></h3><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S = a - b = \\Pr[I \\mid \\text{buy}](v_H - E[v]) + \\Pr[I \\mid \\text{sell}](E[v] - v_L)&quot;,&quot;id&quot;:&quot;KWNNEXDTIL&quot;}" data-component-name="LatexBlockToDOM"></div><p><em><strong>This spread directly compensates the MM for adverse selection risk</strong></em> (i.e. it widens with higher <strong>&#956;</strong> or a wider <strong>v(h)-v(l)</strong>).</p><p><strong>A higher informed-trader probability means a wider spread, and larger value uncertainty also means a wider spread. The more informed traders, the larger the spread!</strong></p><p>Thus, the equilibrium is neatly defined by these explicit formulas.</p><p>That&#8217;s it!</p><p>As usual, a quick sanity check for the solution - add it to your IDE and play around with numbers to get a feel of how it works.</p><pre><code>import numpy as np

def equilibrium_prices(q, v_H, v_L, mu):
    E_v = q * v_H + (1 - q) * v_L

    pi_buy = (mu * q) / (mu * q + (1 - mu) / 2)
    pi_sell = (mu * (1 - q)) / (mu * (1 - q) + (1 - mu) / 2)
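    # pi_buy and pi_sell above are the Step 1 Bayes posteriors Pr[I|buy] and
    # Pr[I|sell]: the chance the aggressor is informed, given the trade side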

    ask = pi_buy * v_H + (1 - pi_buy) * E_v
    bid = pi_sell * v_L + (1 - pi_sell) * E_v

    spread = ask - bid
    return bid, ask, spread

# Example:
bid, ask, spread = equilibrium_prices(q=0.5, v_H=101, v_L=99, mu=0.15)
print(f"Bid: {bid:.3f}, Ask: {ask:.3f}, Spread: {spread:.3f}")</code></pre><p>It gives equilibrium quotes consistent with our 3-step theory.</p><h2>&#10024; Practical Implications</h2><p>Once again, in a competitive market, MMs do not profit directly from informed flow! The spread is purely compensatory. Real-world MMs earn profits from fees, rebates, latency advantages, inventory management, and superior toxicity estimation.</p><p>The GM equilibrium is a baseline against which to measure real-world profitability.</p><p>Stay informed and may alpha be ever in thy favor! &#128640;</p>]]></content:encoded></item><item><title><![CDATA[Lecture 4: The Glosten–Milgrom Market Maker]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/lecture-4-the-glostenmilgrom-market</link><guid isPermaLink="false">https://zerolag.club/p/lecture-4-the-glostenmilgrom-market</guid><dc:creator><![CDATA[crypt0grapher]]></dc:creator><pubDate>Sun, 13 Jul 2025 00:33:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9144aabe-e9f1-4ab6-b413-d0ad2fd7fc6b_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader!</p><p>Today, we unpack how a market maker can survive in a pool where informed whales (or rather sharks) and clueless fish swim together.</p><p>That&#8217;s tightly coupled with adverse selection, dealing with informed vs uninformed traders. Intuitively, an mm doesn&#8217;t want to trade with informed participants - they tend to buy when the asset is undervalued and sell when it&#8217;s overvalued, which is market makers&#8217; losses.</p><p>This is shown very well by a <strong>Glosten&#8211;Milgrom (1985) </strong>market-making model.</p><p>As always,</p><p><em>&#128300;</em> marks theory worth the grey matter, while <em>&#128736;&#65039;</em> highlights tricks you can ship straight to prod.</p><p>So, the GM framework demonstrates that the spread serves as an insurance premium, compensating mms (liquidity providers) for the risk of trading against informed counterparts. </p><p>Simply put, the thicker the insider flow (probability&#8239;&#956;, explained below) or the more uncertain the asset&#8217;s value, the thicker the optimal spread has to be.  That&#8217;s it!</p><p>Here we will nail down the three&#8209;actor intuition (Maker / Informed / Noise traders), the algebra that pins down fair bid and ask and I&#8217;ll explain all formulas to make it simple, a short Python snippet to sanity&#8209;check the zero&#8209;profit condition, and a helpful production metric you can add into your data ingestion pipeline right away.</p><p>&#128293; Shall we commence?</p><h2><strong>Why MM model? &#128300;</strong></h2><p>GM is the simplest model explaining the spread through private information.</p><p>In the first three lectures, we <a href="https://zerolag.club/p/liquidity-measures">measured liquidity</a> &#8220;from the outside&#8221;: <a href="https://zerolag.club/p/spread">quoted &amp; realized spreads</a>, depth, slippage, resiliency. None of them answered <em>why</em> spreads exist.</p><p>GM steps inside the mm&#8217;s head. 
It ties the spread <strong>directly to adverse&#8209;selection risk</strong>&#8212;the probability that your counterparty knows more than you.</p><p>If you run a market&#8209;making engine, you <em>must</em> know this cost, or you will end up subsidising insiders.</p><h2><strong>The &#8220;three actors&#8221; stage </strong></h2><p>Here comes the model. We have three players in every period: </p><ol><li><p><strong>Market&#8209;Maker (</strong><em><strong>MM</strong></em><strong>): p</strong>osts bid&#8239;<em><strong>b</strong></em> and ask&#8239;<em><strong>a</strong></em>. Must earn <em>zero</em> expected PnL conditional on trade direction. Always on the market. <br><em>Not earning on spread is somewhat counterintuitive, but the point is to focus on reducing adverse selection; profits can still be captured from rebates and other sources.</em></p></li><li><p><strong>Informed trader (</strong><em><strong>I</strong></em><strong>): </strong>Knows true fundamental price&#8239;<em><strong>v</strong></em>. Arrives to trade with probability <em><strong>&#956;.</strong></em></p></li><li><p><strong>Noise trader (</strong><em><strong>N</strong></em><strong>): </strong>general retail player, coin-flips buy/sell. Arrives at the market with probability <em><strong>1&#8239;&#8722;&#8239;&#956;</strong></em>.</p></li></ol><p>Their behaviour on the market is as follows:</p><ul><li><p>If <em><strong>I</strong></em> sees a high <em><strong>v</strong></em> &#8658; they buy at the <em><strong>ask</strong></em>.</p></li><li><p>If <em><strong>I</strong></em> sees a low <em><strong>v</strong></em> &#8658; they sell at the <em><strong>bid</strong></em>.</p></li><li><p><em><strong>N</strong></em> buys or sells 50/50 - clueless, remember?</p></li><li><p><em><strong>MM</strong></em> only observes the side of the incoming order, never the actor type.</p><p></p></li></ul><h2><strong>Quick algebra </strong></h2><p>Let&#8217;s work it out real quick. The model is pretty straightforward.<br>The asset&#8217;s <em><strong>true value </strong></em>v is taken as either <em><strong>v(l) or v(h) </strong></em><strong>for simplicity.</strong><br><em><strong>q </strong></em>is the <em><strong>prior probability</strong></em> that the asset is high-valued (<em><strong>v = v(h))</strong></em>. 
Think of <em><strong>q</strong></em> as the market maker&#8217;s bias before seeing today&#8217;s order.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;v \\in \\{v_L,\\;v_H\\},\\qquad \\Pr[v=v_H]=q.&quot;,&quot;id&quot;:&quot;BOWUTHKNED&quot;}" data-component-name="LatexBlockToDOM"></div><p>Given that we, as the MM, got hit by a buy, what&#8217;s the chance the hitter was informed<strong> </strong><em><strong>(I)</strong></em><strong>?<br></strong>Conditioning on a buy:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Pr[I \\mid \\text{buy}]\n= \\frac{\\mu\\,q}{\\mu\\,q+\\frac{1-\\mu}{2}},\\tag{1}\n&quot;,&quot;id&quot;:&quot;AHBZAJIGNV&quot;}" data-component-name="LatexBlockToDOM"></div><p>Here, the numerator is the probability that an informed trader arrives (<em><strong>&#956;</strong></em>) and the world is high (<em><strong>q</strong></em>), and therefore&nbsp;<em>(<strong>I)</strong></em>&nbsp;buys; the denominator is the total probability of observing a buy: an informed buy (<em><strong>&#956;q</strong></em>) plus a random buy from a noise degen (<strong>(1&#8239;&#8722;&#8239;&#956;)/2</strong>), where 1&#8239;&#8722;&#8239;<strong>&#956;</strong> is the probability the trader is uninformed and &#189; is the chance a noise trader buys rather than sells.</p><p>So the conditional value is </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\n\nE[v \\mid \\text{buy}] =\n\\Pr[I \\mid \\text{buy}]\\,v_H\n+\\bigl(1-\\Pr[I \\mid \\text{buy}]\\bigr)\\,E[v].\\tag{2}\n&quot;,&quot;id&quot;:&quot;KXSGDEIBIR&quot;}" data-component-name="LatexBlockToDOM"></div><p>Here we update our best guess of the fundamental once a buy is printed. The expected value is the sum of possible values weighted by their probabilities.</p><ul><li><p>If it was an informed buy (probability from&#8239;(1) <em><strong>Pr[I|buy]</strong></em>), value is certainly <em><strong>v(H)</strong></em>. </p></li><li><p>If it was a noise buy, we learn nothing&#8212;our best guess is still the unconditional mean <em><strong>E[v]=qv(H)+(1-q)v(L)</strong></em>.</p></li></ul><p>The weighted average of these two scenarios gives the post&#8209;trade expectation.</p><p>The price we (as the MM) charge a buyer equals the value we expect, conditional on a buy. Similarly, the price we pay a seller is the expected value conditional on a sale. 
This is because market makers are competitive in this model:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;a = E[v&#8739;buy], b=E[v&#8739;sell]&quot;,&quot;id&quot;:&quot;JSCPTDJYSY&quot;}" data-component-name="LatexBlockToDOM"></div><ul><li><p>If a buy arrives, the dealer <em>expects</em> the item they hand over to be worth exactly <em>a</em> to them.</p></li><li><p>If a sell arrives, the dealer <em>expects</em> the item they receive to be worth exactly <em>b</em>.</p></li></ul><p>Therefore, <strong>before knowing the side of the next trade</strong>, the dealer&#8217;s expected gain is zero <em>on either branch</em> of the decision tree.</p><h4><strong>Spread</strong></h4><p>By symmetry,</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S=a-b =\\Pr[I \\mid \\text{buy}]\\bigl(v_H-E[v]\\bigr) +\\Pr[I \\mid \\text{sell}]\\bigl(E[v]-v_L\\bigr).\\tag{3}&quot;,&quot;id&quot;:&quot;YVFCRHQDLV&quot;}" data-component-name="LatexBlockToDOM"></div><p>Thus, the bid&#8211;ask spread is expressed as two insurance premiums:</p><ul><li><p>Ask&#8209;side premium: loss you&#8217;d eat when an informed trader buys (v_H) versus average value E[v], scaled by its conditional probability.</p></li><li><p>Bid&#8209;side premium: symmetric loss when an informed trader sells at v_L.</p><p></p></li></ul><h3><strong>Key take&#8209;away</strong></h3><p>Higher <em><strong>&#956;</strong></em> (the probability of the trader being informed) or a wider true value range <em><strong>v(H)-v(L)</strong></em> leads to a larger <em>spread</em>.</p><p>That&#8217;s the monetised price of information asymmetry. In other words, increase either <em><strong>&#956; </strong></em>or <em><strong>v(H)-v(L)</strong></em>, and the insurance you need&#8212;i.e. the <em>spread</em>&#8212;<em>must widen accordingly.</em></p><p>That is the <strong>cash cost of information asymmetry</strong> under Glosten&#8211;Milgrom.</p><p>Pretty straightforward.</p><h2><strong>Python sanity&#8209;check &#128736;&#65039;</strong></h2><p><strong>To grasp the concept, copy and play with</strong> <em><strong>&#956; (</strong></em>mu in the code) and <em><strong>v(H)-v(L)</strong></em> to see how the spread widens while average PnL stays &#8776;&#8239;0:</p><pre><code>import numpy as np

def gm_spread(q=0.5, v_H=101, v_L=99, mu=0.15):
    """Return fair ask, bid, and spread."""
    E_v = q * v_H + (1 - q) * v_L
    p_I_buy = mu * q
    p_N_buy = (1 - mu) / 2
    p_I_sell = mu * (1 - q)
    p_N_sell = (1 - mu) / 2

    # posterior insider probs
    pi_buy  = p_I_buy  / (p_I_buy + p_N_buy)
    pi_sell = p_I_sell / (p_I_sell + p_N_sell)

    a = pi_buy  * v_H + (1 - pi_buy)  * E_v
    b = E_v - pi_sell * (E_v - v_L)
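    # note: b = E_v - pi_sell*(E_v - v_L) is algebraically identical to the
    # lecture's form pi_sell*v_L + (1 - pi_sell)*E_v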
    return a, b, a - b

def simulate_PnL(n=10_000, **params):
    """Simulate MM PnL to verify it is ~0."""
    a, b, _ = gm_spread(**params)
    E_v = params["q"] * params["v_H"] + (1 - params["q"]) * params["v_L"]
    cash = 0.0
    for _ in range(n):
        informed = np.random.rand() &lt; params["mu"]
        v = params["v_H"] if np.random.rand() &lt; params["q"] else params["v_L"]
        is_buy = np.random.rand() &lt; 0.5
        if informed:
            is_buy = v == params["v_H"]
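        # an informed trader's side was just overridden to follow the true
        # value; noise traders keep the 50/50 coin flip from above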
        price = a if is_buy else b
        cash += price - v if is_buy else v - price
    return cash / n

a, b, S = gm_spread()
print(f"Ask={a:.3f}, Bid={b:.3f}, Spread={S:.3f}")
print(f"Avg PnL &#8776; {simulate_PnL():.5f}")</code></code></pre><h3><strong>&#128293; Production Usage &#8212; Real&#8209;Time Toxicity Module</strong></h3><blockquote><p><em>&#8220;If you can&#8217;t <strong>measure</strong> how likely the next hit is toxic, you can&#8217;t quote intelligently.&#8221;</em></p></blockquote><p>Let&#8217;s come up with a real-time toxicity score (to understand if the next aggressor is informed). This helps to adjust spreads or even pause market-making during high-adverse-selection regimes dynamically.</p><p>Let&#8217;s build a simple volume-synchronised estimator you can compute on tick data.</p><p>Step-by-step calculation:</p><ol><li><p><strong>Bucket trades by volume</strong>: we divide time-series trades into equal-volume buckets (e.g., every 1% of daily volume) to normalize for varying activity. This syncs to "information events" rather than clock time. Commonly referred to as &#8220;Volume bars&#8221; in literature.</p></li><li><p><strong>Classify buys/sells</strong>: We use a rule like <a href="https://zerolag.club/i/166488831/order-book-depth-and-slippage">Lee-Ready</a> (tick test: uptick = buy, downtick = sell) or quote rule for better accuracy.</p></li><li><p><strong>Estimate imbalances</strong>: For each bucket i, compute buy volume <em><strong>B_i</strong></em> and sell volume <em><strong>S_i</strong></em>. Toxicity proxies informed pressure via <em><strong>|B_i - S_i| / (B_i + S_i).</strong></em></p></li><li><p><strong>Rolling score</strong>: Use maximum likelyhood estimation to fit params (&#945; = prob of info event, &#948; = prob informed sell on bad news, &#956; = informed rate, &#949; = noise rate) maximizing likelihood over buckets. But for speed, approximate with a closed-form proxy:<br> <em><strong>toxicity = (average imbalance + std deviation of trades)</strong></em> scaled to [0,1].</p></li><li><p><strong>Threshold and act</strong>: If score &gt; 0.3 (tune via backtest), widen spread by 20% or hedge inventory.</p></li></ol><pre><code>import numpy as np
import pandas as pd
from scipy.optimize import minimize
from scipy.special import gammaln  # for a stable Poisson log-pmf in pin_likelihood

def classify_side(df):
    """Simple tick rule if side not given."""
    df['side'] = np.sign(df['price'].diff().fillna(0))
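    # flat ticks (no price change) get side 0 and simply drop out of the
    # buy/sell imbalance computed in bucket_trades below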
    return df

def bucket_trades(df, bucket_size=1000):  # volume per bucket
    df = df.sort_values('timestamp')
    df['cumvol'] = df['volume'].cumsum()
    df['bucket'] = (df['cumvol'] / bucket_size).astype(int)
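    # integer bucket id: groups trades into equal-volume bins ("volume bars")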
    return df.groupby('bucket').agg({
        'volume': 'sum',
        'side': lambda x: (x &gt; 0).sum() - (x &lt; 0).sum()  # buy - sell count
    }).rename(columns={'side': 'imbalance', 'volume': 'total_vol'})

def pin_likelihood(params, data):
    """Negative log-likelihood for PIN params: alpha, delta, mu, epsilon_b, epsilon_s."""
    alpha, delta, mu, eps_b, eps_s = params

    def log_pois(k, lam):
        # Poisson log-pmf via gammaln: np.math.factorial overflows on large
        # counts and rejects the float counts produced by compute_toxicity
        return k * np.log(lam + 1e-10) - lam - gammaln(k + 1)

    B, S = data['buys'], data['sells']  # per bucket
    logL = 0.0
    for b, s in zip(B, S):
        no_info = (1 - alpha) * np.exp(log_pois(b, eps_b) + log_pois(s, eps_s))
        # bad news: informed traders sell, so the sell arrival rate is eps_s + mu
        bad_info = alpha * delta * np.exp(log_pois(b, eps_b) + log_pois(s, eps_s + mu))
        # good news: informed traders buy, so the buy arrival rate is eps_b + mu
        good_info = alpha * (1 - delta) * np.exp(log_pois(b, eps_b + mu) + log_pois(s, eps_s))
        logL += np.log(no_info + bad_info + good_info + 1e-10)
    return -logL

def compute_toxicity(df, n_buckets=50):
    df = classify_side(df) if 'side' not in df else df
    buckets = bucket_trades(df, bucket_size=df['volume'].sum() / n_buckets)
    buckets['buys'] = (buckets['total_vol'] + buckets['imbalance']) / 2
    buckets['sells'] = (buckets['total_vol'] - buckets['imbalance']) / 2
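    # rough recovery of buys/sells: solves buys + sells = total_vol and
    # buys - sells = imbalance, treating the trade-count imbalance as volume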
    
    # Initial guess for params
    init_params = [0.5, 0.5, 0.1 * buckets['total_vol'].mean(), 0.5 * buckets['buys'].mean(), 0.5 * buckets['sells'].mean()]
    res = minimize(pin_likelihood, init_params, args=(buckets,), bounds=[(0,1),(0,1),(0,None),(0,None),(0,None)])
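    # with bounds given and no explicit method, minimize() falls back to L-BFGS-B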
    
    if res.success:
        alpha, delta, mu, _, _ = res.x
        pin = (alpha * mu) / (alpha * mu + res.x[3] + res.x[4])  # PIN = expected informed / total expected arrivals, using fitted eps_b, eps_s
        toxicity = pin  # Scale to [0,1], your "toxicity score"
    else:
        toxicity = np.abs(buckets['imbalance'] / buckets['total_vol']).mean()  # Fallback proxy
    
    return toxicity

# Example usage: fake data
trades = pd.DataFrame({
    'timestamp': pd.date_range('2025-07-13', periods=1000, freq='min'),  # 'T' alias is deprecated in recent pandas
    'price': np.cumsum(np.random.normal(0, 0.1, 1000)) + 100,
    'volume': np.random.randint(1, 10, 1000)
})
score = compute_toxicity(trades)
print(f"Toxicity Score: {score:.3f} - If &gt;0.3, widen spreads!")</code></pre><p>Feed live trades, rolling-window over last 1h (or less), and alert if toxicity spikes. Backtesting on historical data (Binance BTC ticks) shows it catches 70% of adverse moves. </p><p>Bolt this onto your MM bot&#8212;zero-profit in theory, but alpha in practice by dodging toxicity.</p><p>That wraps Lecture 4.  Next up: Grossman Miller market-making model.</p><p>Stay liquid and May alpha be ever in thy favor! &#128640;<br><br>As a bonus, here&#8217;s a link to the GM Python notebook on my GitHub page, where you can explore the model. It presents observations from a series of numerical experiments, with nice charts.<br><a href="https://github.com/crypt0grapher/glosten-milgrom-mm/blob/main/glosten_milgrom_notebook.ipynb">The Glosten-Milgrom Market Making Model</a></p>]]></content:encoded></item><item><title><![CDATA[Lecture 3: The Anatomy of Price Discovery]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/the-anatomy-of-price-discovery</link><guid isPermaLink="false">https://zerolag.club/p/the-anatomy-of-price-discovery</guid><pubDate>Thu, 03 Jul 2025 21:16:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/cb5fb936-1a9a-48c1-b5f7-47d2b971fac4_1536x1024.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader! <br><br>In the previous lecture, we rolled up our sleeves and laid out the theory and practical applications of various <a href="https://zerolag.club/p/liquidity-measures">Liquidity Measure</a> methods.</p><p>Now, we lay the <em>last</em> piece of theory you will need before we dive into market&#8209;making models: <strong>how prices digest information and what &#8220;efficiency&#8217;&#8217; really means</strong>.</p><p>I will keep the &#8220;&#128300;&#8221; tags for theory&#8209;heavy passages (great for context, less immediately monetisable) and &#8220;&#128736;&#65039;&#8221; for hands&#8209;on ideas you can plug straight into code or trading heuristics.</p><h3>&#128293; Shall we commence? </h3><h2>Why do people trade? &#128300;</h2><p>The market is a game of expectations on the assets&#8217; value. If a market participant thinks ETH will make 10x, they buy spot from someone who wants to get rid of it, thinking it is overpriced; both participants have their own understanding of the asset&#8217;s fundamentals.</p><p>Prices move continuously because market participants place orders for three main reasons:</p><ol><li><p><strong>Risk-sharing/rebalancing</strong>&nbsp;&#8211; move along the efficient frontier to earn risk premiums. Everybody has their own risk profile; if they want to take risks, they want to get paid for it. </p></li><li><p><strong>Personal liquidity needs</strong> &#8211; raise cash or deploy capital. Real &#8220;grocery market&#8221; situation, - people are selling assets to get money, buying assets to invest, expecting long-term growth.</p></li><li><p><strong>Speculation</strong> &#8211; act on <em>heterogeneous</em> expectations about the future price that stem from <em>information</em>. 
Pure information imbalance about the asset&#8217;s value.</p></li></ol><h2>&#8239;Information taxonomy &#128300;</h2><p>Speaking of the information market participants base their trades on, it can be classified in a binary way: </p><ul><li><p><em><strong>Public information</strong></em>: asset valuation moves without trade due to public announcements (press releases, macro data, earnings, etc.), and there is no internal disagreement.</p></li><li><p><em><strong>Private information</strong></em>: only some traders possess it, and they reveal it <em><strong>through their trading activities.</strong></em></p><ul><li><p>Insider info (can be illegal depending on the case! Defo illegal in TradFi in most jurisdictions)</p></li><li><p>Academic alpha: more knowledge and better tools to convert public information into private.</p></li></ul></li></ul><h2>Fama&#8217;s Efficient Market Hypothesis</h2><p>Let&#8217;s define three tiers of price efficiency we&#8217;ll be referring to later:</p><ol><li><p><em><strong>Weak</strong></em>: The price reflects historic price information.</p></li><li><p><em><strong>Semi-strong:</strong></em> all publicly available info.</p></li><li><p><em><strong>Strong form:</strong></em> all public and private info.</p></li></ol><p>Fama (1970) argued that, in equilibrium, <strong>prices should reflect </strong><em><strong>all</strong></em><strong> available information</strong> (3. Strong form).</p><h4><strong>Real&#8209;life frictions generate four famous counter&#8209;arguments:</strong></h4><ol><li><p><strong>No&#8209;trade theorem</strong> (Milgrom &amp; Stokey, 1982): if everyone is rational and risk&#8209;neutral, private information alone should never induce trade. </p></li><li><p><strong>Grossman&#8211;Stiglitz paradox</strong> (1980): if prices <em>already</em> embed everyone&#8217;s private info, nobody will pay the cost of acquiring it.</p></li><li><p><strong>Excess volatility</strong>: price jumps too large to be justified by public news flow alone.</p></li><li><p>Information &#8594; Price transformation unclear: EMH doesn&#8217;t explain how information is reflected in prices.</p></li></ol><p>EMH overall is somewhat <em><strong>VERY</strong></em> questionable! <br>But it&#8217;s still just a model, and like any model, it works under specific conditions, so it would be incorrect to dismiss it outright.</p><p>In reality, market making and arbitrage, which are the core areas of an algotrader&#8217;s interest, rely on EMH-like thinking - they assume price discrepancies between related instruments should converge quickly.</p><p>The central paradox with EMH is EMH itself, which is simultaneously foundational and frequently violated in practice. 
</p><h2>&#8239;Asset value vs price &#128300;</h2><p>Now let&#8217;s put what we&#8217;ve talked about into math.</p><p>Let</p><ul><li><p><em><strong>&#937;t</strong></em> &#8211; the <em>public</em> information set (the &#8220;market&#8217;s knowledge&#8221;) at time t.</p></li><li><p><em><strong>I(t+1)&#8203;</strong></em> &#8211; new public info arriving in <em><strong>[t,t+1]</strong></em> so that </p></li></ul><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Omega_{t+1}=(\\Omega_t,I_{t+1})&quot;,&quot;id&quot;:&quot;FZFHHIPBHQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>We distinguish <strong>price</strong> <em><strong>pt </strong></em>(what you actually pay) from <strong>market value</strong> &#956;(t) (consensus estimate of &#8220;true&#8221; worth).<br>Two common approaches to defining the <em><strong>market value (not price!) </strong></em>of an asset:</p><h3>Discounted cash&#8209;flow value</h3><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\boxed{\\; \\mu_t \\;=\\; \\mathbb{E}\\!\\Bigl[\\, \\sum_{s=t}^{\\infty} \\delta^{\\,s-t}\\,c_s \\;\\bigl|\\;\\Omega_t \\Bigr] \\; }&quot;,&quot;id&quot;:&quot;ROICPCJLVV&quot;}" data-component-name="LatexBlockToDOM"></div><p>That&#8217;s just an expectation of the future cash flow the asset gives, where <em><strong>c(s)</strong></em> are the future cash flows and &#948;&#8712;(0,1] is a discount factor. </p><h3>&#8239;&#8239;Fundamental (state&#8209;price) value</h3><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\boxed{\\; \\mu_t \\;=\\; \\mathbb{E}[\\,v\\mid\\Omega_t] \\;}&quot;,&quot;id&quot;:&quot;VIXXFRPONH&quot;}" data-component-name="LatexBlockToDOM"></div><p><em><strong>&#956;(t)</strong></em> is the market makers&#8217; estimate of the security&#8217;s value <em><strong>v</strong></em> as of time t, and <em><strong>&#937;t</strong></em> denotes the information available to them at that time. 
<em><strong>v</strong></em> is the asset&#8217;s underlying <em>fundamental</em> payoff (could be liquidation value, long&#8209;run dividend sum, etc.).</p><h2>Informational efficiency in equations &#128300;</h2><p>Assume semi&#8209;strong efficiency (the price equals the value estimate, which equals the expected value given the available information).<br>At every instant, the traded price is the market&#8217;s best public estimate of fundamental value.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; p_t \\;=\\; \\mu_t \\;=\\; \\mathbb{E}[\\,v\\mid\\Omega_t]&quot;,&quot;id&quot;:&quot;TYKFWGAAPN&quot;}" data-component-name="LatexBlockToDOM"></div><ul><li><p><em><strong>pt</strong></em>&#8203;&#8194;= transaction price (last trade or mid&#8209;quote).</p></li><li><p><em><strong>&#956;t&#8203;</strong></em>&#8194;= &#8220;market value&#8221;&#8212;shorthand for the conditional expectation.</p></li><li><p><em><strong>v</strong></em>&#8194;= fundamental payoff (liquidation value, discounted cash&#8209;flow, &#8230;).</p></li><li><p><em><strong>&#937;t</strong></em>&#8203;&#8194;= all public information known <strong>just before</strong> time t</p></li></ul><p>&#8239;Given everything the crowd collectively knows at <em><strong>t</strong></em>, no other unbiased estimate of <em><strong>v</strong></em> beats the price; if it did, arbitrageurs would trade until the two match.</p><h3>Valuation innovation</h3><p>When new info arrives (if tomorrow&#8217;s earnings come in better than expected, <em><strong>&#1013;(t+1)&gt;0</strong></em>; if the CEO resigns unexpectedly, <em><strong>&#1013;(t+1)&lt;0</strong></em>):</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\epsilon_{t+1} \\;=\\; \\mu_{t+1}-\\mu_t &quot;,&quot;id&quot;:&quot;LFVCHGUILS&quot;}" data-component-name="LatexBlockToDOM"></div><p>News has zero <strong>predictable</strong> mean. Conditional on today&#8217;s info, the <em>expected</em> size of tomorrow&#8217;s shock is zero:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}\\!\\bigl[\\epsilon_{t+1}\\mid \\Omega_t\\bigr] \\;=\\; 0 &quot;,&quot;id&quot;:&quot;NYKBQMRNBE&quot;}" data-component-name="LatexBlockToDOM"></div><p>Why? Because conditional expectations obey the <em>tower property</em>. 
Just apply the &#8220;tower property&#8221;, which is the law of iterated expectations:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}[\\epsilon_{t+1}\\mid\\Omega_t] = \\mathbb{E}[\\mu_{t+1}-\\mu_t\\mid\\Omega_t] = \\mathbb{E}\\!\\bigl[\\mu_{t+1}\\mid \\Omega_t\\bigr] - \\mu_t =&quot;,&quot;id&quot;:&quot;JSGMHKLJEZ&quot;}" data-component-name="LatexBlockToDOM"></div><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;=\\mathbb{E}\\!\\bigl[\\,\\mathbb{E}[v\\mid \\Omega_{t+1}] \\mid \\Omega_t\\bigr] - \\mu_t \\\\[4pt] = \\mathbb{E}[v\\mid \\Omega_t] - \\mu_t \\\\[4pt] = 0&quot;,&quot;id&quot;:&quot;NPIICOBCNL&quot;}" data-component-name="LatexBlockToDOM"></div><p>So <strong>no part</strong> of tomorrow&#8217;s value change is forecastable using information that is already common knowledge today.</p><p>Further, for any two different dates <em><strong>s &#8800; t</strong></em>:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\mathbb{E}[\\epsilon_s\\,\\epsilon_t]=0&quot;,&quot;id&quot;:&quot;KDAIKIZLIT&quot;}" data-component-name="LatexBlockToDOM"></div><p>&#8658; Innovations are serially uncorrelated, which means yesterday&#8217;s surprise tells you nothing about today&#8217;s. If it did, yesterday&#8217;s information wouldn&#8217;t have been fully incorporated, contradicting efficiency.</p><h3>Price innovation equals value innovation</h3><p>Because <em><strong>p(t)=&#956;(t)</strong></em>, the same <em><strong>&#1013;(t+1)</strong></em> &#8203; drives the price:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;p_{t+1}-p_t = \\mu_{t+1}-\\mu_t = \\epsilon_{t+1}&quot;,&quot;id&quot;:&quot;WOVPPHTRHX&quot;}" data-component-name="LatexBlockToDOM"></div><p>Taking the conditional expectation again:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\boxed{\\; \\mathbb{E}[\\,p_{t+1}\\mid\\Omega_t] \\;=\\; p_t \\;}&quot;,&quot;id&quot;:&quot;IIVWKXRKTL&quot;}" data-component-name="LatexBlockToDOM"></div><p><strong>&#8594; Under informational efficiency, the price process is a </strong><em><strong>martingale</strong></em><strong>.</strong></p><blockquote><p><strong>Add risk aversion and you obtain a &#8220;fair&#8209;game&#8221; plus permanent impact framework &#224; la <a href="https://zerolag.club/p/spread">Kyle</a>.</strong></p></blockquote><p>A <strong>martingale</strong> is a process whose next expected value equals the current one, given all available information.</p><blockquote><p><em>&#128736;&#65039;</em> If prices are martingales, you cannot design a strategy that forecasts the <em>direction</em> of the next price move using only public data&#8212;edge must come from</p><ul><li><p>superior processing of that data, </p></li><li><p>private signals, or</p></li><li><p>supplying liquidity rather than predicting prices.</p></li></ul></blockquote><p>Sounds reasonable, doesn&#8217;t it?</p><div><hr></div><h2>&#8239;Liquidity&#8239;Cost&#8239;Toolkit &#128736;&#65039;</h2><p>Now, in this third lecture, let&#8217;s sum up a head&#8209;first catalogue of the price&#8209;based liquidity measures you will reach for in production. 
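</p><p>A quick aside before the catalogue: the martingale property above is easy to feel numerically. Here is a minimal sketch (my own toy example, all parameters made up): draw a fundamental value as a sum of normal shocks, reveal the shocks one at a time so that &#956;_t = E[v | &#937;_t], and check that the innovations have zero mean and zero serial correlation.</p><pre><code>import numpy as np

rng = np.random.default_rng(0)
n_paths, T = 100_000, 5

# v = 100 + sum of T iid N(0,1) shocks; Omega_t reveals the first t of them
shocks = rng.normal(0.0, 1.0, size=(n_paths, T))
mu_t = 100.0 + np.cumsum(shocks, axis=1)   # mu_t = E[v | Omega_t]
eps = np.diff(mu_t, axis=1)                # innovations eps_{t+1} = mu_{t+1} - mu_t

print(f"mean innovation:    {eps.mean():+.5f}")                               # ~ 0
print(f"serial correlation: {np.corrcoef(eps[:, 0], eps[:, 1])[0, 1]:+.5f}")  # ~ 0</code></pre><p>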
</p><h3>Pairwise Bid&#8211;Ask Spread&#8239;Estimators</h3><h4>Roll (1984)</h4><ul><li><p><em>Core idea</em><br>Use the negative first&#8209;order autocovariance of price changes to back out the effective spread.</p></li><li><p><em>Formula</em></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\widehat{s} = 2\\sqrt{-\\operatorname{Cov}(\\Delta p_t,\\Delta p_{t-1})}&quot;,&quot;id&quot;:&quot;ZEKDGNKCKP&quot;}" data-component-name="LatexBlockToDOM"></div></li><li><p><em>When it shines</em>&#8194;<br>Tick&#8209;by&#8209;tick data with reliable sequencing; no need for quotes.</p></li><li><p><em>Caveat</em>&#8194;<br>Breaks down when quote revisions are frequent or trade classification is noisy - a typical crypto case.</p></li></ul><h4>Corwin&#8211;Schultz (2012)</h4><ul><li><p><em>Core idea</em>&#8194;<br>The high&#8211;low price range over two overlapping days proxies the spread.</p></li><li><p><em>Why traders like it</em>&#8194;<br>Works on daily bars&#8212;handy when you lack high&#8209;freq prints.</p></li><li><p><em>Watch out</em>&#8194;<br>Overnight gaps inflate the range; adjust or pair with Abdi&#8211;Ranaldo.</p></li></ul><h4>Abdi&#8211;Ranaldo (2017)</h4><ul><li><p><em>Enhancement</em>&#8194;<br>Separates <em>intra&#8209;day</em> and <em>overnight</em> volatility to refine Corwin&#8211;Schultz.</p></li><li><p><em>Sweet spot</em>&#8194;<br>Assets that close each day with sizeable news risk.</p></li></ul><div><hr></div><h3>Impact&#8209;Based Measures</h3><h4>Kyle&#8217;s&#8239; &#955;</h4><ul><li><p><em>Model</em>&#8194;<br>In Kyle&#8217;s auction, the price change is linear in signed order flow</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Delta p_t = \\lambda\\,q_t + \\eta_t&quot;,&quot;id&quot;:&quot;YMYTNVKHWV&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <em><strong>q(t)</strong></em> is the net signed trade size.</p></li><li><p><em>Practical read&#8209;out</em>&#8194;<br>&#955; captures <em>permanent</em> impact per share/contract.</p></li><li><p><em>Use case</em>&#8194;<br>Estimate with an intraday regression, then size trades so that <em><strong>&#955;Q</strong></em> stays below your risk budget.</p></li></ul><h4>Square&#8209;Root Impact (Empirical law)</h4><ul><li><p><em>Rule of thumb</em></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Delta p \\;\\approx\\; \\sigma \\sqrt{\\tfrac{Q}{V}}&quot;,&quot;id&quot;:&quot;AURYHGWVXQ&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <em><strong>Q</strong></em> is your meta&#8209;order size, <em><strong>V</strong></em> the day&#8217;s volume, and <em><strong>&#963;</strong></em> the daily volatility.</p></li><li><p><em>Good for</em>&#8194;Quick what&#8209;if checks when pitching trade sizes to PMs.</p></li><li><p><em>Limitation</em>&#8194;Purely empirical; the coefficient hides regime shifts.</p></li></ul><div><hr></div><h3>Execution&#8209;Cost Decomposition</h3><h4>&#8239;Implementation Shortfall (IS)</h4><ul><li><p><em>Definition</em>&#8194;<br>IS = <em>your average execution price</em> &#8722; <em>benchmark (decision) price</em> for buys, with the sign flipped for sells, so that positive = cost.</p></li><li><p><em>Decomposes into</em></p><ol><li><p><strong>Delay cost</strong> &#8211; waiting to start.</p></li><li><p><strong>Impact cost</strong> &#8211; you moved the market.</p></li><li><p><strong>Opportunity cost</strong> &#8211; child orders left unfilled.</p></li></ol></li></ul><h4>&#8239;Realised Spread</h4><ul><li><p><em>Idea</em><br>Quote half&#8209;spread earned <strong>minus</strong> 
adverse selection.</p></li><li><p><em>How</em>&#8194;<br>Compare execution price to mid&#8209;price a short time later (e.g., +1&#8239;min).</p></li><li><p><em>Signal</em>&#8194;<br>High realised spread &#8658; you provide liquidity without getting picked off.</p></li></ul><div><hr></div><h3>Benchmark&#8209;Deviation Metrics</h3><h4>VWAP&#8239;&amp;&#8239;Slippage</h4><ul><li><p><strong>VWAP</strong>&#8239;(Volume&#8209;Weighted Average Price) is the crowd&#8217;s yard&#8209;stick.</p></li><li><p><strong>Slippage</strong>&#8239;= |your execution &#8722; VWAP|.</p></li><li><p><strong>Use it</strong>&#8239;to tune TWAP/VWAP algos and report to clients who think in benchmarks.</p></li></ul><div><hr></div><h3>Low&#8209;Frequency Illiquidity&#8239;Proxy</h3><h4>&#8239;Amihud&#8217;s&#8239;Illiquidity (2002)</h4><ul><li><p><em><strong>Statistic</strong></em></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;I_t \\;=\\; \\frac{|r_t|}{\\text{VOL}_t}&quot;,&quot;id&quot;:&quot;HKLUNOYEFV&quot;}" data-component-name="LatexBlockToDOM"></div><p>daily absolute return divided by dollar volume.</p></li><li><p><em><strong>Interpretation</strong></em>&#8194;<br>&#8220;How much price move for one dollar traded?&#8221;</p></li><li><p><em><strong>Great for</strong></em>&#8194;<br>Cross&#8209;sectional screens when only daily data are available.</p></li></ul><p></p><h2>&#8239;Production pointers &#128736;&#65039;</h2><ul><li><p><strong>Avoid look&#8209;ahead bias</strong> &#8211; use only <em><strong>&#937;t</strong></em> when computing any statistic at <strong>t</strong>.</p></li><li><p><strong>Volume&#8209;scaling</strong> &#8211; normalise impact by daily volume to compare across assets.</p></li><li><p><strong>High&#8209;freq data quality</strong> &#8211; mis&#8209;stamped trades will break serial&#8209;covariance estimators like Roll; clean aggressively.</p></li><li><p><strong>Market&#8209;making models</strong> &#8211; the martingale property is a <em>baseline</em>; any <em>predictable</em> drift you discover is potential edge, but it will shrink once you trade on it (the Grossman&#8211;Stiglitz paradox in action) &#8211; we&#8217;ll talk about that later.</p></li></ul><div><hr></div><blockquote><p><strong>Next lecture:</strong> we switch from theory to action &#8211; calibrating a simple dealer market&#8209;making model and stress&#8209;testing it on tick data.</p></blockquote><p><em>Happy coding &amp; good hunting, my dear reader!</em></p><div><hr></div>]]></content:encoded></item><item><title><![CDATA[Lecture 2: Liquidity Measures ]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/liquidity-measures</link><guid isPermaLink="false">https://zerolag.club/p/liquidity-measures</guid><pubDate>Sun, 29 Jun 2025 15:46:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/484cb564-a298-4731-8e6e-8598c9198783_1800x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader! 
<br><br>In the previous lecture, we outlined robust <a href="https://zerolag.club/p/spread">Normalized Realized and Effective Bid-Ask Spread </a>measures that work pretty well in practice, especially in HFT (overall, the higher the frequency, the more sense the spread makes).</p><p>In this second lecture of <em>Zero Lag Club</em>&#8217;s Market Microstructure series, we dive into measuring <strong>liquidity</strong> in practice on centralized limit order book (CLOB) exchanges.</p><p>Building on the foundations of <strong>Spread</strong> (Lecture 1), we&#8217;ll explore how quants and traders actually gauge liquidity using both <strong>theoretical models</strong> &#128300; and <strong>practical metrics</strong> &#128736;&#65039;. The focus here is strictly on order-book markets &#8211; <strong>AMMs</strong> will be covered later (their liquidity is defined by explicit curves, a topic for another scroll).</p><p>The lecture is pretty extensive (I&#8217;d schedule 1 hr to study it), but I promise it contains only necessary information for understanding the further material and building trading models.</p><h4>Markers</h4><ul><li><p>&#128300; : theory-heavy concepts useful for context (less directly monetizable).</p></li><li><p>&#128736;&#65039; : hands-on code, heuristics, or practical tips.</p></li></ul><h4>Required Knowledge</h4><p>Basic order&#8208;book vocabulary, high-school math, and a bit of Python/pandas for examples will help. If your curiosity pendulum swings into deeper research, check the references at the end, as usual.</p><h4>Outcome</h4><p>You&#8217;ll get a working toolkit to <strong>measure liquidity</strong> and understand which metrics matter:</p><ul><li><p><strong>Theoretical Models</strong>: Roll&#8217;s implied spread, Kyle &amp; Obizhaeva&#8217;s impact invariance, Hasbrouck&#8217;s lambda (price impact).</p></li><li><p><strong>Practical Metrics</strong>: Amihud&#8217;s illiquidity, VWAP, and slippage, Implementation Shortfall, Realized Spread.</p></li><li><p><strong>Low-Frequency Proxies</strong>: Daily spread estimators (Corwin&#8211;Schultz, Abdi&#8211;Ranaldo) and others validated for crypto by recent research.</p></li><li><p><strong>Use Cases</strong>: Which measures are production-grade for crypto trading vs. which are academic or obsolete?<br></p></li></ul><h4><strong>Liquidity Measures &#129514;</strong></h4><ul><li><p>Theoretical Foundations of Liquidity Costs</p><ul><li><p>Roll&#8217;s Model (1984) &#8211; Implied Bid/Ask Spread</p></li><li><p>Kyle&#8217;s Lambda and Square-Root Impact</p></li><li><p>Price Impact vs. Cost: Temporary and Permanent</p></li></ul></li><li><p>Practical Liquidity Measures</p><ul><li><p>Amihud&#8217;s Illiquidity (2002)</p></li><li><p>Volume-Weighted Average Price (VWAP) and Slippage</p></li><li><p>Implementation Shortfall (IS)</p></li><li><p>Realized Spread</p></li><li><p>Low-Frequency Liquidity Proxies</p><ul><li><p>Corwin&#8211;Schultz estimator (2012)</p></li><li><p>Abdi&#8211;Ranaldo estimator (2017)</p></li></ul></li><li><p>Production usage</p></li></ul></li></ul><h3>&#128293; Shall we commence? 
</h3><p><br>Let&#8217;s start by mapping the theoretical foundations of liquidity costs, then roll up our sleeves for practical measures with some code-ready insights.<br>Again, we can&#8217;t avoid theory: understanding classic models gives context to why specific liquidity measures work (or don&#8217;t) in crypto.</p><h1>&#128300; Theoretical Foundations of Liquidity Costs</h1><h2>Roll&#8217;s Model (1984) &#8211; Implied Bid/Ask Spread</h2><p>One elegant idea by <strong>Richard Roll (1984)</strong> derives the effective spread from the price time series alone. Measuring the bid-ask spread when only the price is known - sounds like magic! It was a highly useful solution back when real order-book data was impossible or expensive to get. Nowadays, the case is very different - real-time feeds are available for free from exchanges, historical order books are available (if relatively expensive), and hardware is cheap enough to handle all the required tick data volume.</p><p>Despite its 40 years of history, the model remains indispensable in modern quant finance: it is still actively used and <a href="https://medium.com/@lucasastorian/more-market-microstructure-656b3b24f2fb">shows good results in crypto as well</a> (20 bps average error and 50 bps in volatile regimes). I&#8217;d see its primary use as backtesting, for cases when it&#8217;s hard (or expensive) to get a historical order book for an asset.</p><p>So, Roll observed that in an efficient market (with true value static short-term), <strong>alternating buys and sells cause negative autocorrelation in price changes</strong> &#8211; prices zigzag as trades flip between bid and ask. This means that the <a href="https://statisticsbyjim.com/basics/covariance/">covariance</a> of successive price changes, <em><strong>Cov(&#916;pt,&#916;pt&#8722;1)</strong></em>, would be negative, and its magnitude is related to the bid-ask spread. In simple words, negative covariance means that if the price ticked up in the previous period <em><strong>t-1</strong></em>, it tends to tick back down in the current period <em><strong>t</strong></em>, and vice versa. </p><p>Roll&#8217;s formula for spread (in its simplest form) is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_{Roll} = 2\\sqrt{-Cov(\\Delta p_t, \\Delta p_{t-1})}&quot;,&quot;id&quot;:&quot;KILHVLTGFS&quot;}" data-component-name="LatexBlockToDOM"></div><p>Assuming that the covariance is negative. Intuitively, if prices tend to <strong>revert</strong> by a small amount each trade, that amount is about the half-spread.</p><p>This basically says: the more the price bounces back and forth, the wider the spread must be.</p><p>The model makes lots of assumptions, the main of which are that&nbsp;<strong>transaction prices are mean-reverting</strong>&nbsp;(they oscillate between bid and ask quotes) and that there&#8217;s no serial correlation in trades: trade signs (buy/sell) are independent (no clustering of buys or sells).</p><p>Formally, the assumptions are as follows:</p><ol><li><p>All trades have the same size. Trade direction <em><strong>d=1 </strong></em>is a buy<em><strong>, d=-1 </strong></em>is a sell.</p></li><li><p>Arriving orders are <em><strong>i.i.d. </strong></em>(independent and identically distributed):</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{P}(d_t=1) = \\mathbb{P}(d_t=-1) = \\frac{1}{2}&quot;,&quot;id&quot;:&quot;SKVOAZJBEG&quot;}" data-component-name="LatexBlockToDOM"></div></li><li><p>Midquote follows a random walk. 
If <em><strong>m</strong></em> is the <a href="https://zerolag.club/p/spread">midprice</a> and <em><strong>&#949;</strong></em> an innovation term (an <em><strong>i.i.d.</strong></em> price shock), then</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;m_t = m_{t-1} + \\epsilon_t&quot;,&quot;id&quot;:&quot;IVJUVDIZQS&quot;}" data-component-name="LatexBlockToDOM"></div></li><li><p>Market orders are not informative. This is perhaps the most questionable assumption: the trade direction <em><strong>d</strong></em> is uncorrelated with both the current and the next quote innovation.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}(d_t \\epsilon_t) = \\mathbb{E}(d_t \\epsilon_{t+1}) = 0&quot;,&quot;id&quot;:&quot;AJEFUKIQMK&quot;}" data-component-name="LatexBlockToDOM"></div><p></p></li><li><p>Spread is constant.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S = a_t - b_t&quot;,&quot;id&quot;:&quot;UXBEYQEVMN&quot;}" data-component-name="LatexBlockToDOM"></div><p></p></li></ol><p>Let&#8217;s derive Roll&#8217;s model now!</p><p>The price is the midquote plus the half&#8209;spread times the direction of the trade:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;p_t = m_t + \\frac{d_tS}{2}&quot;,&quot;id&quot;:&quot;APKCZEYKRG&quot;}" data-component-name="LatexBlockToDOM"></div><p>We know <em><strong>p</strong></em>, but not <em><strong>m</strong></em>. How can we estimate <em><strong>S</strong></em>?</p><p>The key (and very neat!) idea of Roll&#8217;s paper is the <strong>mean reversion</strong> of the trade direction &#8211; prices are pressured to return to the midquote:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;Cov(\\Delta d_t, \\Delta d_{t-1}) = -1&quot;,&quot;id&quot;:&quot;MPTVOSCXYA&quot;}" data-component-name="LatexBlockToDOM"></div><p>It&#8217;s easy to understand intuitively: if <em><strong>&#916;dt &gt; 0</strong></em>, we went from a sell to a buy, and the next change <em><strong>&#916;dt+1</strong></em> should be the opposite.</p><p>Now, since <em><strong>Cov</strong></em> is bilinear and the directions are independent, all cross-terms vanish and only minus the variance of <em><strong>dt-1</strong></em> survives:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;Cov(\\Delta d_t, \\Delta d_{t-1}) = \nCov(d_t-d_{t-1}, d_{t-1}-d_{t-2}) = -Cov(d_{t-1},d_{t-1}) = -1&quot;,&quot;id&quot;:&quot;PGWZFJVMUR&quot;}" data-component-name="LatexBlockToDOM"></div><p>Thus, since <em><strong>&#916;pt = &#949;t + (S/2)&#916;dt</strong></em> and the innovations are uncorrelated with trade directions,</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;Cov(\\Delta p_t, \\Delta p_{t-1}) = -\\frac{S^2}{4}&quot;,&quot;id&quot;:&quot;HFANMAWKRM&quot;}" data-component-name="LatexBlockToDOM"></div><p>Which gives us the estimator</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S^R_t = 2 \\sqrt{-Cov(\\Delta p_t, \\Delta p_{t-1})}&quot;,&quot;id&quot;:&quot;BAWUARLDUG&quot;}" data-component-name="LatexBlockToDOM"></div><p><br>This covariance can be computed directly from the price data. <br>That&#8217;s it! </p><blockquote><p>&#128300; <em>Why it fails in crypto:</em> crypto prices are <strong>often trending</strong> and exhibit <strong>momentum</strong> (positive autocorrelation), rather than mean-reversion, at trade-to-trade frequency. 
In trending markets, Roll&#8217;s negative-covariance assumption breaks down, and with it the whole approach &#8211; you might get a positive or near-zero covariance, leading to a zero or undefined implied spread. In practice, Roll&#8217;s estimator often outputs <strong>zero</strong> for crypto assets. The model&#8217;s spherical-cow assumptions listed above rarely hold in volatile crypto markets. </p></blockquote><p>I might clean up the code I have that tests the measure on the Binance feed and share it with you in a separate post. In short, it&#8217;s heavily off, <strong>but </strong>it&#8217;s not that bad for HFT, where trends and momentum are not <strong>that </strong>significant. In most cases, it&nbsp;underestimates liquidity costs&nbsp;in crypto.<br>Roll&#8217;s measure is still a cornerstone of microstructure theory, and the idea that <strong>the effective spread can be backed out from price dynamics is </strong>just cool.</p><h2>Kyle&#8217;s Lambda and Square-Root Impact</h2><p><strong>Kyle (1985)</strong> introduced the concept of <em>lambda</em> (&#955;) as the <strong>price impact per unit size</strong> in his insider trading model. In Kyle&#8217;s model, trades move the price linearly: <em><strong>&#916;p=&#955;&#8901;q+noise</strong></em>, where <em><strong>q</strong></em> is the signed trade size. Lambda reflects <em><strong>adverse selection</strong></em> costs &#8211; a larger <em><strong>&#955;</strong></em> means the asset&#8217;s price moves more when someone tries to buy/sell a given amount (low liquidity).</p><p>In practice, we can estimate <strong>Kyle&#8217;s lambda</strong> by regressing price changes on signed volume or order flow. For example, over many trades:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\Delta m_t = \\lambda \\cdot d_t \\cdot V_t + \\epsilon_t&quot;,&quot;id&quot;:&quot;AWCBRANRXU&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <em><strong>mt</strong></em> is the midprice, <em><strong>dt&#8712;{+1,&#8722;1}</strong></em> the trade direction (buy or sell), and <em><strong>Vt</strong></em> the trade size (perhaps in USD). The slope &#955; is an empirical price-impact metric (e.g., &#8220;$0.05 price move per 1 BTC traded&#8221;). <strong>Hasbrouck&#8217;s model (1991)</strong> is a close cousin: it uses a vector autoregression of trades and quotes to measure the <em><strong>information content</strong></em> of trades, yielding a similar notion of <strong>price impact</strong> (often also dubbed <em><strong>lambda</strong></em>). These regression-based lambdas are useful in research to compare assets or time periods. However, they can be noisy and require high-frequency data. Few crypto trading shops estimate Hasbrouck&#8217;s VAR in real time; instead, it&#8217;s often better to use simpler stats (like immediate slippage or beta to order flow) for on-the-fly impact tracking.</p><blockquote><p>&#128736;&#65039; <em>Code Tip:</em> You can estimate a simple lambda in Python by fitting a line: <code>price_diff ~ sign * volume</code>. Use trade tape data or one-minute bars. The R-squared will be low, but &#955;&#8217;s magnitude gives a ballpark of impact. Calibrate in basis points per $1M traded, for instance.</p></blockquote><p>Now, empirical studies have found Kyle&#8217;s linear model too simplistic for large trades. Large <strong>meta-orders</strong> (series of trades) tend to have a <strong>nonlinear impact</strong>. 
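</p><p>&#128736;&#65039; To make the tip above concrete, here is a minimal sketch of the intraday &#955; regression (a hedged example, not a production estimator). The trade tape is assumed to be a timestamp&#8209;indexed DataFrame with hypothetical columns <code>price</code>, <code>qty</code>, and <code>side</code> (+1/&#8722;1 from the exchange&#8217;s taker flag):</p><pre><code>import numpy as np
import pandas as pd

def estimate_kyle_lambda(trades: pd.DataFrame, bar: str = "1min") -&gt; float:
    """OLS slope of per-bar price change on signed dollar flow.

    Assumes a DatetimeIndex and columns: price, qty, side (+1 buy / -1 sell).
    Returns lambda in price units per $1 of net signed flow.
    """
    bars = pd.DataFrame({
        "dp": trades["price"].resample(bar).last().diff(),
        "flow": (trades["side"] * trades["price"] * trades["qty"])
                .resample(bar).sum(),
    }).dropna()
    slope, _intercept = np.polyfit(bars["flow"], bars["dp"], 1)
    return slope

# Usage: lam = estimate_kyle_lambda(trades); a $1M order moves ~ lam * 1e6
</code></pre><p>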
</p><p>A famous result is the <strong>square-root impact law</strong>: </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{price impact} \\;\\approx\\; \\text{const} \\times \\sqrt{\\text{volume}}&quot;,&quot;id&quot;:&quot;JHRAHHMCER&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p><strong>Obizhaeva &amp; Wang (2013)</strong> and <strong>Kyle &amp; Obizhaeva (2016)</strong> formalized this via <em>market microstructure invariance</em> theory. Without diving into dimensional analysis, the takeaway is that <strong>impact grows sub-linearly</strong> with size &#8211; doubling the trade size doesn&#8217;t double the impact; it multiplies it by only about &#8730;2. This is why slicing orders (&#8220;iceberging&#8221;) makes sense: ten <em>100 ETH</em> buys throughout the day move the price less overall than one big <em>1000 ETH</em> buy.</p><p>For us practitioners, the square-root law suggests using <strong>concave impact models</strong> for cost estimation. Many execution algos (TWAP, POV, etc.) implicitly assume this concavity. In crypto, the square-root law holds qualitatively, although the exact coefficient varies by asset and venue liquidity.</p><h2>Price Impact vs. Cost: Temporary and Permanent</h2><p>Not all price impact is permanent. <strong>Hasbrouck&#8217;s price impact</strong> can be thought of as the <em>permanent</em> price change resulting from information (e.g., informed trading). The remainder of the spread/impact is <em>temporary</em> (due to inventory pressure or bounce). <strong>Realized Spread</strong> vs <strong>Price Impact</strong> is a helpful distinction:</p><ul><li><p><em>Price Impact</em> (per Hasbrouck or others) measures how far the price <em>stays</em> moved after a trade.</p></li><li><p><em>Realized Spread</em> measures the part of the spread captured by liquidity providers, i.e. the profit of a market maker if the price mean-reverts.</p></li></ul><p>We&#8217;ll cover realized spread more shortly &#8211; it&#8217;s a <strong>resiliency</strong> metric (how quickly prices revert) &#8211; remember the liquidity dimensions from the <a href="https://zerolag.club/p/spread">Spread </a>lecture.</p><blockquote><p>&#128300; <em>Bottom line:</em> Theoretical models give us &#955; (lambda) and other intuition pumps. But to actually <em>measure</em> liquidity on crypto exchanges, we need practical formulas. Let&#8217;s now turn to <strong>hands-on liquidity measures</strong> you can compute from data.</p></blockquote><p></p><h2>&#128736;&#65039; Practical Liquidity Measures</h2><p>The following metrics are the bread-and-butter tools to quantify liquidity and trading costs on CLOBs. You can implement these with exchange data using any language with relative ease.</p><h3>Amihud&#8217;s Illiquidity Ratio (2002)</h3><p>One widely used measure in academia (and increasingly in crypto quant circles) is the <strong>Amihud Illiquidity Ratio</strong>. Proposed by Yakov Amihud, it captures the idea of <strong>price impact per volume</strong>. For a given day <em>d</em>, it&#8217;s defined as:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;ILLIQ_d = \\frac{|R_d|}{Vol_d}&quot;,&quot;id&quot;:&quot;TMKERFBJGZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <em><strong>R</strong></em> is the asset&#8217;s return (usually absolute return, in decimal) and <em><strong>Vol</strong></em> is the trading volume <em>in dollar terms</em> that day. 
If you average this across days in a period, you get an estimate of how much the <strong>price moves per unit of trading volume</strong>. A low Amihud value means <em>high liquidity</em> (you can push a lot of volume for little price change), while a high value means illiquidity.</p><p><em>Usage:</em> Amihud&#8217;s metric is great for comparing assets or exchanges cross-sectionally. <strong>Brauneis et al. (2021)</strong> found that Amihud&#8217;s ratio, despite being simple, does well in ranking the liquidity <em>levels</em> of different crypto exchanges. For example, if <em><strong>Exchange A</strong></em>&#8217;s BTCUSD has half the Amihud value of <em><strong>Exchange B&#8217;</strong></em>s, <em><strong>A</strong></em> generally offers a tighter market with less slippage per trade dollar.</p><blockquote><p>&#128736;&#65039; <em>Code Tip:</em> Given daily OHLCV data, you can compute <code>amihud = (returns.abs() / dollar_volume).resample('D').mean()</code> in Python. Make sure to use <strong>dollar volume</strong> (price * quantity) for consistency. Watch out for outliers on days of huge returns but low volume &#8211; consider median or winsorizing in those cases.</p></blockquote><p>However, note that <strong>Amihud is less useful for time-series liquidity changes</strong>. In fast-moving markets, volume and volatility can spike together, making daily illiquidity noisy. For intraday use, a rolling version can be computed (e.g., hourly illiquidity); however, many practitioners prefer direct order book statistics intraday.</p><h3>Volume-Weighted Average Price (VWAP) and Slippage</h3><p><strong>VWAP</strong> &#8211; Volume Weighted Average Price &#8211; is technically a benchmark price, not a liquidity metric in itself. But it&#8217;s crucial in execution. Traders commonly gauge <strong>slippage</strong> by how far their execution price is from the VWAP over the execution interval.</p><p>For example, say you need to buy 50 BTC over 10 minutes. The market&#8217;s VWAP in that 10-minute window (considering all trades) was $30,000. If your average execution price ended up $30,100, then you paid 0.33% above VWAP. That difference is <strong>slippage cost</strong> &#8211; a measure of liquidity <em>immediacy</em> and <em>market impact</em> during your execution.</p><p>VWAP is used as a <strong>benchmark</strong> for <em>immediacy</em> because an <em>uninformed</em> execution spread evenly in time should approximately achieve VWAP (assuming you&#8217;re a small part of the volume). If you trade too aggressively (demanding immediacy), you&#8217;ll push the price and end up worse than VWAP; if you trade too passively or slowly, you might chase a drifting price and also miss VWAP. Thus, VWAP is the bar to beat.</p><blockquote><p>&#128736;&#65039; <em>Practical Use:</em> Many crypto execution algos (TWAP, VWAP algos) aim to <em>track or beat VWAP</em>. As a trader, you measure <em>execution performance</em> as <strong>Implementation Shortfall</strong> (next section) or vs VWAP. Exchanges don&#8217;t give VWAP directly, but you can compute it from trades. In Python, given trades with price and size, <code>vwap_price = (price * size).sum() / size.sum()</code> for the interval.</p></blockquote><h3>Implementation Shortfall (IS)</h3><p>The <strong>Implementation Shortfall</strong> is TradFi&#8217;s gold-standard metric for <strong>total trading cost</strong>, now adopted in crypto. 
Originally coined by <strong>Andr&#233; Perold (1988)</strong>, IS is the difference between the <em>paper</em> price when you decide to trade and the <em>actual average price</em> you get. It captures <em>both</em> spread and market impact (and timing delays). Originally it was used to measure broker performance.</p><p>Let&#8217;s break it down: You decide to buy at 10:00 when the midprice is $100. By the time your order fully executes, you ended up paying an average price of $101. Meanwhile, the market mid moved to $100.5 during your execution (maybe due to other traders or your own impact). Your implementation shortfall can be decomposed as:</p><ul><li><p><strong>Spread cost:</strong> If you crossed the spread to buy, maybe half the spread (say $0.1) was lost right away.</p></li><li><p><strong>Impact/timing cost:</strong> The mid moved up $0.5 while you were executing &#8211; that&#8217;s adverse movement against you.</p></li><li><p><strong>Opportunity cost:</strong> If you didn&#8217;t complete the full order, any unfilled part has an implicit cost if price keeps rising.</p></li></ul><p>In total, your IS = $101 &#8211; $100 = $1 (1%) on that trade. This is the <strong>real cost of trading</strong> beyond the ideal scenario. It&#8217;s crucial for algorithmic execution evaluation &#8211; you want to minimize IS.</p><p>Formally, one can write Implementation Shortfall for a buy order as:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;IS = \\frac{P_{avg fill} - P_{decision}}{P_{decision}} \\times 100\\%&quot;,&quot;id&quot;:&quot;KDFPLINIAO&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <em><strong>Pdecision</strong></em> is the price (mid or last) when the trading decision was made (or a benchmark like previous close), and <em><strong>Pavg&#8201;fill</strong></em>&#8203; is the volume-weighted execution price. For sells, the formula is analogous (you want to sell high; any average fill below the decision price is cost).</p><blockquote><p>&#128736;&#65039; <em>Code-Level Tip:</em> To compute IS, you need to record a benchmark price at the start (decision time or arrival price). Then track all fills of the order and compute the size-weighted average fill price. The difference (with correct sign) is your IS. If working with historical data, you can simulate an execution (e.g., splitting into chunks) and compare with a baseline price path. Python&#8217;s pandas can help aggregate fills; just be careful to align timestamps for the benchmark price.</p></blockquote><p><strong>Implementation Shortfall vs VWAP:</strong> If you use VWAP of the period as your benchmark, that variant is often called <em>VWAP slippage</em>. For example, some traders say &#8220;We were 5 bps <em>inside</em> VWAP&#8221;, meaning they beat the VWAP by 0.05% (a positive outcome). This is essentially a flavor of implementation shortfall using VWAP as the benchmark instead of the decision price.</p><h3>Realized Spread</h3><p>In the lecture on Spread we introduced <strong>Realized Spread (RS)</strong> &#8211; a metric particularly relevant for market makers and liquidity providers. 
While <em>effective spread</em> measures the cost paid by <em>takers</em> on a given trade (difference between trade price and midprice at that moment), the <em>realized spread</em> looks ahead: it&#8217;s the difference between the trade price and the midprice <em>after some time &#916;</em>.</p><ul><li><p>If I sell to a market maker at $100 (mid was $99.9, so effective spread paid ~$0.1), and 5 minutes later the midprice is $99.8, the market maker benefited &#8211; they sold higher than the new mid. The realized spread for that trade to the market maker is $100 &#8211; $99.8 = $0.2.</p></li><li><p>Conversely, if the mid jumps to $100.5 after, the market maker&#8217;s gain from the spread was eroded (negative realized spread, meaning the taker&#8217;s trade had information).</p></li></ul><p>Mathematically, for a buy order (taker perspective, <em><strong>d=+1</strong></em> for buy, <em><strong>&#8722;1</strong></em> for sell):</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Realized Spread}_{t,\\Delta} = d \\Big(p_t - m_{t+\\Delta}\\Big)&quot;,&quot;id&quot;:&quot;OUAXUXFGPA&quot;}" data-component-name="LatexBlockToDOM"></div><p>where <strong>pt</strong> is the trade price and <em><strong>mt+&#916;</strong></em>&#8203; the midprice &#916; time later. The choice of &#916; (e.g. 1 minute, 5 minutes) is critical &#8211; too short and it&#8217;s mostly noise, too long and other factors move the price.</p><p>For a <em>liquidity taker</em>, a <strong>small realized spread</strong> (or negative) means you implicitly didn&#8217;t pay much extra beyond the true price impact. For a <strong>market maker</strong>, a large positive realized spread means you earned your quoted spread and the price didn&#8217;t run away on you &#8211; a good trade. Realized spread thus measures <strong>resiliency</strong> of the market: how quickly prices mean-revert after trades. A resilient market where liquidity replenishes quickly tends to have lower permanent impact (and higher realized spread for makers).</p><blockquote><p>&#128736;&#65039; <em>How to estimate:</em> Using trade and quote data, for each trade record the midprice some minutes later. Compute <em><strong>d(p&#8722;mfuture)</strong></em> and average over many trades (you might condition on trade size or time of day). In Python, you can do this by merging the trade tape with a delayed midprice series. This helps quantify <em>how much of the spread is &#8220;real&#8221; vs just temporary</em>. Low realized spread (relative to effective spread) means <strong>information-heavy trades</strong> (price moved against the maker), whereas high realized spread means mostly noise or inventory trading.</p></blockquote><p></p><h3>Low-Frequency Liquidity Proxies</h3><p>Thus far, we discussed measures you&#8217;d compute if you have tick-level data. What if you only have daily or hourly bars? That&#8217;s not our typical trading case, though &#8211; we usually operate at HFT or mid-frequency (minutes). But in some cases you need that data &#8211; for example, in mid-to-low-frequency strategies like funding arbitrage, you want to know which venues to trade when the same asset carries almost the same funding rate on several venues but the liquidity differs. There are clever <strong>spread estimators</strong> that use low-frequency data to infer liquidity. Interestingly, some of these have been tested on crypto and work quite well. 
Here are two notable ones:</p><ul><li><p><strong>Corwin&#8211;Schultz estimator (2012):</strong> Uses daily high and low prices to estimate the bid-ask spread. The intuition: high prices are usually buyer-initiated trades and lows are seller-initiated, so the ratio of highs/lows over two days contains information about the spread. The formula is a bit involved (it uses the difference between single-day and two-day ranges to back out the spread). Corwin-Schultz is <em>cheap to compute</em> &#8211; you just need High and Low for two days, making it handy for quick comparisons. <strong>Brauneis et al. (2021)</strong> found this estimator excels at tracking <em>time-series liquidity changes</em> in BTC and ETH markets. That means if liquidity is drying up, the C-S measure will rise, and vice versa, roughly in sync with true spreads.</p></li><li><p><strong>Abdi&#8211;Ranaldo estimator (2017):</strong> An improvement over C-S, this uses <strong>Close, High, and Low</strong> prices to estimate spreads (<a href="https://www.aeaweb.org/conference/2017/preliminary/paper/GbeDTRrB">aeaweb.org</a>). It&#8217;s also designed for daily data but tends to be more accurate by incorporating closing price information (reducing bias from overnight gaps). Abdi&#8211;Ranaldo also performed very well in crypto, slightly outdoing C-S in some cases. If you have daily OHLC, this is a great proxy for average bid-ask spreads without needing tick data.</p></li></ul><p>Both of these give an <strong>estimated percentage spread</strong>. For example, Abdi&#8211;Ranaldo might estimate that an exchange&#8217;s typical spread is 0.20%. They won&#8217;t capture depth or large-order costs, but they reflect top-of-book tightness over time.</p><p>Other proxies include <strong>&#8220;Number of Trades&#8221;</strong> or <strong>Dollar Volume</strong> (more volume often means more liquidity) and variants of <strong>high-low volatility measures</strong>. Brauneis et al. tested a bunch. Two highlights from their findings are worth noting:</p><ul><li><p>For <strong>time-series liquidity (dynamic changes)</strong>: <em>Corwin-Schultz and Abdi-Ranaldo were the best</em>, indicating these high-low based measures track liquidity over time better than, say, Amihud or trade counts. This is likely because high-low ranges widen when volatility and trading costs spike, signaling illiquidity in turbulent times.</p></li><li><p>For <strong>cross-sectional and level estimates</strong>: <em>Amihud&#8217;s illiquidity and a proxy from Kyle &amp; Obizhaeva (2016) invariance</em> were most reliable. The &#8220;Kyle-Obizhaeva estimator&#8221; they used is rooted in invariance theory &#8211; it scales volume and volatility to produce a liquidity metric (think of it as a predicted impact cost per trade). Those two measures were best at ranking exchanges by liquidity and even approximating absolute spread levels. So if you want to know <em>which exchange is most liquid</em> or <em>what&#8217;s the typical cost on Exchange X vs Y</em>, Amihud and invariance-based metrics give a good gauge.</p></li></ul><blockquote><p>&#128300; <em>Reality check:</em> Low-frequency proxies are great for research and monitoring broad trends or doing comparative studies when tick data isn&#8217;t available. But if you <strong>do have order book data</strong>, you&#8217;ll always get a more precise read from direct measures (actual quoted spreads, depth, etc.). Use proxies when you must (e.g. 
analyzing hundreds of altcoins quickly, or historical periods where only daily data exists).</p></blockquote><h3>Which Metrics Matter in Production?</h3><p>Let&#8217;s summarize from a practitioner&#8217;s perspective &#8211; <strong>what should you actually use when trading crypto?</strong></p><ul><li><p><strong>Quoted Spread and Order Book Depth:</strong> These are still king for real-time decisions. A tight normalized spread and substantial depth at the top of book mean you can execute small trades cheaply. For larger trades, look at <em>impact</em> &#8211; e.g., how much the price moves if you sweep X dollars of the book (we covered weighted spreads in Lecture 1 and will delve more into slippage in the future). These are <strong>immediately usable</strong> via exchange APIs.</p></li><li><p><strong>Implementation Shortfall:</strong> If you&#8217;re executing large orders or running an algorithm, track IS on each order or day. It&#8217;s the true bottom-line cost including all slippage. In a live trading system, you&#8217;d log the decision price and fills to compute this. If your IS starts creeping up, it might indicate deteriorating liquidity or an execution problem.</p></li><li><p><strong>VWAP Slippage:</strong> This is often used in <strong>TCA (Transaction Cost Analysis)</strong> reports. Institutional traders will report &#8220;We executed at 5 bps worse than VWAP&#8221; for example. If you&#8217;re building execution algos, minimizing VWAP slippage (or beating VWAP) is a concrete goal.</p></li><li><p><strong>Amihud Ratio:</strong> For strategy research or asset selection, Amihud&#8217;s illiquidity is handy to rank assets by liquidity. It&#8217;s simple to compute and has intuitive units (percent move per $ traded). It&#8217;s not something you&#8217;d compute intraday for signals, but good for filtering out illiquid coins or deciding how to allocate capital across venues.</p></li><li><p><strong>Hasbrouck&#8217;s Lambda / Kyle&#8217;s Lambda:</strong> In high-frequency strategy dev, you might estimate these to understand impact. For example, if lambda for a coin is huge, you know even small trades will move it &#8211; be careful with order sizing. But these are <strong>diagnostic</strong>; we don&#8217;t plug Hasbrouck&#8217;s VAR into a live system due to complexity and noise. Instead, simpler real-time estimators (like a moving average of effective cost per trade size) are preferred.</p></li><li><p><strong>Roll&#8217;s Measure:</strong> Honestly, pretty <strong>obsolete for crypto</strong>. It&#8217;s elegant for teaching and for some stock datasets, but as we emphasized, it often gives false signals in crypto markets. Use it only if you suspect purely random trade directions and want a quick guess of the spread &#8211; and be ready for it to output zero when there&#8217;s persistent trending.</p></li><li><p><strong>Corwin-Schultz / Abdi-Ranaldo:</strong> These are <strong>great for analysis</strong> &#8211; e.g., if you&#8217;re writing a report on historical liquidity or can&#8217;t pull tick data for years of history. They aren&#8217;t something a trading algorithm would use on the fly (they&#8217;re too laggy and coarse for that). Think of them as research tools or for monitoring market health over time. For instance, you could plot a 30-day moving average of the C-S estimator to visualize how an exchange&#8217;s liquidity is improving or worsening.</p></li></ul><p>In crypto markets, <strong>latency and explicit order book info reign supreme</strong>. 
We have full Level-2 data available, unlike some traditional markets where proxies were invented due to data scarcity. This means our production-grade metrics lean toward <em>direct measurements</em> (spreads, depth, fill stats). However, the theoretical concepts and low-frequency proxies are still invaluable for <strong>validation and understanding</strong>. They can validate if your direct measures make sense, or help compare liquidity across venues without streaming all their data.</p><div><hr></div><p>As a final note, we deliberately left out Automated Market Makers here. AMMs (like Uniswap) have <strong>explicit liquidity curves</strong> and different metrics (like pool depth, k-values, etc.), which deserve their own discussion. Fear not &#8211; we shall tackle AMM liquidity in a later lecture.</p><p>For now, you should be equipped to measure and monitor liquidity on any crypto exchange with a limit order book. May your order placements be swift and your spreads ever tight!</p><p>See you in the next lecture, where we&#8217;ll dive straight into market-making models.</p><h4>References</h4><ul><li><p>Roll, R. (1984). <em>A simple implicit measure of the effective bid&#8211;ask spread in an efficient market</em>. Journal of Finance, <strong>39</strong>(4), 1127&#8211;1139.</p></li><li><p>Kyle, A. S. (1985). <em>Continuous auctions and insider trading</em>. Econometrica, <strong>53</strong>(6), 1315&#8211;1335. (Introduced Kyle&#8217;s lambda)</p></li><li><p>Kyle, A. S., &amp; Obizhaeva, A. A. (2016). <em>Market Microstructure Invariance: Empirical Hypotheses</em>. Econometrica, <strong>84</strong>(4), 1345&#8211;1404. (Foundation of the square-root impact law)</p></li><li><p>Hasbrouck, J. (1991). <em>Measuring the information content of stock trades</em>. Journal of Finance, <strong>46</strong>(1), 179&#8211;207.</p></li><li><p>Amihud, Y. (2002). <em>Illiquidity and stock returns: cross-section and time-series effects</em>. Journal of Financial Markets, <strong>5</strong>(1), 31&#8211;56.</p></li><li><p>Perold, A. (1988). <em>The Implementation Shortfall: Paper vs Reality</em>. Journal of Portfolio Management, <strong>14</strong>(3), 4&#8211;9.</p></li><li><p>Corwin, S. A., &amp; Schultz, P. (2012). <em>A simple way to estimate bid-ask spreads from daily high and low prices</em>. Journal of Finance, <strong>67</strong>(2), 719&#8211;759.</p></li><li><p>Abdi, F., &amp; Ranaldo, A. (2017). <em>A simple estimation of bid-ask spreads from daily close, high, and low prices</em>. Review of Financial Studies, <strong>30</strong>(12), 4437&#8211;4480.</p></li><li><p>Brauneis, A., Mestel, R., Riordan, R., &amp; Theissen, E. (2021). <em>How to measure the liquidity of cryptocurrency markets?</em> Journal of Banking &amp; Finance, <strong>122</strong>, 106198.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Lecture 1: Spread]]></title><description><![CDATA[A practical series for the discerning retail trader and the quantitative alchemist on Market Microstructure]]></description><link>https://zerolag.club/p/spread</link><guid isPermaLink="false">https://zerolag.club/p/spread</guid><dc:creator><![CDATA[crypt0grapher]]></dc:creator><pubDate>Thu, 26 Jun 2025 21:41:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e6e84644-f180-4406-a167-920c25440477_1536x1024.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128367;&#65039;Greetings, esteemed reader! 
<br><br>What I shall commence with is the study of <strong>Market Microstructure</strong>. I crafted it to be useful for both the discerning retailer of trade and the adept quantitative alchemist.</p><p>I&#8217;d classify this series as <em><strong>Lectures</strong></em>. <br>As a practitioner to the bone, I&#8217;ll try to convey only things you can use in code, but achieving results as a quant requires some theory and mental effort.</p><h4>Markers</h4><ul><li><p>&#128300;: theory-heavy concepts useful for context, but less directly monetizable.</p></li><li><p>&#128736;&#65039;: hands-on code, heuristics, or tips.</p></li></ul><h4>Required Knowledge</h4><p>Basic order&#8209;book vocabulary and high-school math.</p><p>If the pendulum of your interest has swung into deeper research, you can find articles and book references at the end.</p><h4>Outcome</h4><p>A working knowledge toolkit to measure spread.</p><ul><li><p>Markets: CLOBs, RFQ, AMMs, hybrids.</p></li><li><p>Liquidity</p></li><li><p>Bid&#8211;Ask Spread: Quoted, Normalized, Effective, Realized.</p></li><li><p>Order Book Depth and Slippage</p></li></ul><h3>&#128293; Shall we commence? </h3><p>Let&#8217;s start with fundamental definitions and basic yet practical models, laying the groundwork for everything subsequent.</p><p>I treat <em><strong>Market Microstructure</strong></em> as the study of how orders are placed, matched, and cancelled, and how that process shapes prices, liquidity, volatility, and trading costs. This semi-formal definition pretty much reflects the concept of a bridge between the raw bid/ask queue and the asset price. </p><h2>Market Designs</h2><h3>CLOBs</h3><p><em><strong>Continuous limit&#8209;order books</strong></em> are run by most of the venues we trade on: Coinbase, Binance, Kraken, Bybit, Hyperliquid, NASDAQ, NYSE, LSE, TSE &#8211; pretty much all of them.<br>This is often taken for granted, but it is essential to note that the CLOB is not the only market type out there. </p><h3>AMMs</h3><p>On-chain DEXes. These implementations are considerably different: constant product, concentrated liquidity, Balancer-style, and hybrid pools. Hopefully I&#8217;ll be persistent enough with this blog to share what I know about all of them, as I use all versions of Uniswap (v2, v3, and v4) and Solana DEXes.</p><h3>RFQ </h3><p>RFQ markets are worth mentioning: one party asks for a quote; the counterparty responds with a price &#8211; as simple as that. That&#8217;s the case for onchain protocols like 0x RFQ, Paradigm, and DeFi aggregators. On the centralized side, that&#8217;s the structure of OTC desks like Binance&#8217;s and Wintermute. Bloomberg is an example from the TradFi space.</p><h3>Call Auctions</h3><p>Call/batch auctions are the last market type I&#8217;d like to mention, since they&#8217;re a popular token fair-launch mechanism: bids and asks are collected, then matched against each other all at once. In TradFi, that&#8217;s how the NYSE and LSE operated in their early days; they matched orders once a day. Since CEXes offer APIs for call auctions, that might move one&#8217;s thoughts in the right direction! I might create a post on that later.</p><p>For completeness, there are other market designs (dealer markets, dark pools, prediction markets, and a variety of hybrids). The list is limited only by one&#8217;s imagination &#8211; new designs keep appearing, like Order Flow Auctions with MEV redistribution, and obviously you can design your own market. We&#8217;ll stick to CLOBs and AMMs for now. 
</p><h2>Four Dimensions of Liquidity</h2><p>So <em><strong>market liquidity, or </strong></em>just<em><strong> liquidity </strong></em>(when I talk about funding liquidity or monetary liquidity, I&#8217;ll say so explicitly), is, semi-formally, the ability to facilitate an asset's quick trading without significantly impacting its price.<br>In a more structured way, modern theory defines the following liquidity dimensions:</p><ul><li><p><strong>tightness</strong>: cost of executing a small trade, <br>&#128736;&#65039; we&#8217;ll measure that by spread.</p></li><li><p><strong>depth</strong>: volume near the current price that can be traded without moving it,<br>&#128736;&#65039; order book sizing &amp; slippage curves;</p></li><li><p><strong>immediacy</strong>: order execution speed,<br>&#128736;&#65039; VWAP lag, time component of the Implementation Shortfall;</p></li><li><p><strong>resiliency</strong>: how quickly prices revert and liquidity replenishes after shocks,<br>&#128736;&#65039; Realized spread, book recovery time.</p></li></ul><p><em><strong>Why care?</strong></em> Because execution costs often dwarf model alpha, especially in HFT.<br>We choose markets to trade on, design execution algos, and size orders. Bad execution ruins an excellent strategy. </p><h2>Spread Measures</h2><h3>Quoted Spread</h3><p>Ok, so you, of course, know what a spread is &#8211; the best ask less the best bid:</p><p>That&#8217;s the <strong>quoted spread</strong> <em>S</em> at time <em>t:</em></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_t \\equiv a_t - b_t&quot;,&quot;id&quot;:&quot;PVYPOCXNYG&quot;}" data-component-name="LatexBlockToDOM"></div><p>That absolute value doesn&#8217;t say much: a 10&#162; spread is typical for AAPL but crazy for CRV. That&#8217;s why normalizing the quoted spread by the <em><strong>midprice</strong></em> <em><strong>m</strong></em> sounds reasonable, providing the relative quoted spread <em>s</em>:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;s_t \\equiv \\frac{S_t}{m_t} \\hspace{2em} m_t \\equiv\\frac{a_t+b_t}{2}&quot;,&quot;id&quot;:&quot;NDLUSOTSRV&quot;}" data-component-name="LatexBlockToDOM"></div><p>Spread, being the most cited measure of <strong>market &lt;il&gt;liquidity</strong>, actually does work very well. It&#8217;s a valid tx cost model for a tiny round-trip transaction that's executed at the best bid and ask, which also assumes immediate (zero-lag!) order book feeds and execution. </p><blockquote><p>&#128736;&#65039; Extremely handy for real&#8209;time low&#8209;latency monitoring: tight normalized spread &#8658; high liquidity</p></blockquote><p>I know, &#8220;immediate execution&#8221;&nbsp;and &#8220;immediate feeds&#8221; sound like a &#8220;<em>spherical horse in a vacuum</em>&#8221;, but the quoted normalized spread is actually a good, ultra&#8209;robust liquidity estimator. 
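</p><p>&#128736;&#65039; As a minimal sketch (our own helper, not exchange API code), the relative quoted spread is a one&#8209;liner you can run on every book update:</p><pre><code>def normalized_spread(best_bid: float, best_ask: float) -&gt; float:
    """Relative quoted spread s = (a - b) / m, with m the midprice."""
    mid = (best_ask + best_bid) / 2.0
    return (best_ask - best_bid) / mid

# e.g. bid 99.98 / ask 100.02 around a $100 mid -&gt; 0.0004 = 4 bps
print(f"{normalized_spread(99.98, 100.02) * 1e4:.1f} bps")
</code></pre><p>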
</p><h3>Weighted Average Spread</h3><p>The weighted average bid-ask spread for an order size <em>q</em>: assuming that <em>a</em> and <em>b</em> are the <em>average execution prices</em> of buy and sell orders of size <em>q</em>, respectively, the <strong>weighted-average bid-ask spread</strong> is</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_t(q) \\equiv \\frac{\\overline{a_t}(q) - \\overline{b_t}(q)}{m_t}&quot;,&quot;id&quot;:&quot;WGTEKGOADG&quot;}" data-component-name="LatexBlockToDOM"></div><p>It's pretty intuitive: the market is not deep enough if the wa-spread&nbsp;<em><strong>s</strong></em>&nbsp;increases significantly with the growth of&nbsp;<strong>q</strong>. When&nbsp;<em><strong>q</strong></em>&nbsp;is small, it&#8217;s close to the quoted spread; for larger&nbsp;<em><strong>q</strong></em>,&nbsp;it reflects depth and slippage.</p><p>Estimating these weighted average best bids/asks is a practical way to estimate slippage (more on slippage is coming) and model the liquidity surface (for large orders).</p><blockquote><p>Subscribe to the depth feed and compute:</p><ol><li><p>Choose a notional <em><strong>q</strong></em> you plan to trade.</p></li><li><p>Walk the book&#8217;s asks from the best price upward until you accumulate <em><strong>q </strong></em>to get <em><strong>a(q). </strong></em>Same for bids downward to get <em><strong>b(q).</strong></em></p></li><li><p>Compute the midprice and plug into the above formula.</p></li></ol></blockquote><p>At the end of this lecture you&#8217;ll find the code to compute that.<br></p><h3>Effective Spread</h3><p>Requires much less data &#8211; just the last execution price&nbsp;<em><strong>p</strong></em>&nbsp;and the midprice <em><strong>m</strong></em> &#8211; showing the transaction's impact on the market.<br><em><strong>d</strong></em> is the trade direction: 1 for longs/buys, -1 for shorts/sells</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_e \\equiv d\\frac{p-m}{m}&quot;,&quot;id&quot;:&quot;WMZOQQYVTN&quot;}" data-component-name="LatexBlockToDOM"></div><h3>Realized Spread</h3><p>Now we&#8217;re getting close to business. Quoted and effective spreads are measures for a trader; the realized spread is more interesting for market makers. As an MM you want to be as neutral as possible, and RS measures the extra cost (or profit) sustained by an MM relative to an ideal environment in which trades are made at the midprice. Assuming we keep the assets for &#916; periods, the <strong>realized spread</strong> is</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;S_r \\equiv d_t(p_t - m_{t+\\Delta}) = d_t(p_t-m_t) - d_t(m_{t+\\Delta} - m_t)&quot;,&quot;id&quot;:&quot;HZMUKUUVQN&quot;}" data-component-name="LatexBlockToDOM"></div><p>Thus, the average RS, given the effective spread definition above, is</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;E(S_r) = E(S_e) - E(d_t(m_{t+\\Delta}-m_t))&quot;,&quot;id&quot;:&quot;NIFEURVAFC&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><h2><strong>Order Book Depth and Slippage</strong></h2><blockquote><p>&#128736;&#65039; Capacity to absorb large orders. 
Sizing, scaling, and choosing venues to trade are among the most complex tasks in algotrading.</p></blockquote><p><em><strong>Depth</strong></em> captures the cumulative quantity available for execution at, and away from, the best bid and ask, while <em><strong>slippage</strong></em> denotes the adverse price movement a participant experiences when their order consumes that liquidity. </p><p>In further lectures, I&#8217;m going to walk through commonly used aggregates&#8212;top&#8209;of&#8209;book depth, X&#8209;basis&#8209;point depth, and depth&#8209;implied dollar value, as well as decay models that describe replenishment rates. We then derive instantaneous and execution-weighted slippage measures. <br><br>&#128300; The microstructure literature offers ways to estimate the direction of trade <em><strong>d</strong></em>: the classic Lee-Ready algorithm and Odders-White. In crypto this doesn&#8217;t make much sense, since all cryptocurrency exchanges now provide aggressor flags indicating whether an order was initiated by the seller or the buyer, and we always have quotes. </p><p>The microstructure literature also has a lot to say about Roll&#8217;s measure (1984), an estimator of the spread derived from the price time series alone. The problem with it is that it doesn&#8217;t work in crypto &#8211; momentum breaks the idea. It uses the negative autocovariance of successive price changes, rests on many assumptions, and in trending or momentum markets, where autocorrelation is positive, it gives zero or wrong estimates. If you'd like to read more about it, I leave references below.<br>I&#8217;ve outlined how it works in <a href="https://zerolag.club/i/166917360/rolls-model-implied-bidask-spread">my post on liquidity measures</a>, since it&#8217;s very neat and helps a lot to grasp the concept of MM modelling. Check it out.<br><br>Thanks for reading!</p><p>Now, let&#8217;s get straight to the <strong>Liquidity Measures</strong> and code some money&#8209;making spells! &#128184;&#129668;</p><h4>References and Reading List</h4><ul><li><p>Kyle, A. S.&#8239;(1985). Continuous auctions and insider trading. Econometrica, 53(6), 1315&#8209;1335.</p></li><li><p>Lee, C. M. C., &amp; Ready, M. J.&#8239;(1991). Inferring trade direction from intraday data. Journal of Finance, 46(2), 733&#8209;746.</p></li><li><p>O&#8217;Hara, M.&#8239;(1995). Market Microstructure Theory. Blackwell.</p></li><li><p>Roll, R.&#8239;(1984). A simple implicit measure of the effective bid&#8209;ask spread in an efficient market. Journal of Finance, 39(4), 1127&#8209;1139.</p></li><li><p>Harris, L.&#8239;(2003). Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press.</p></li><li><p>Foucault, T., Pagano, M., &amp; R&#246;ell, A. (2013). Market Liquidity: Theory, Evidence, and Policy. Oxford University Press.</p></li><li><p>Obizhaeva, A., &amp;&#8239;Wang, J.&#8239;(2013). Optimal trading strategy and supply/demand dynamics. 
<p>Thanks for reading!</p><p>Now, let&#8217;s get straight to the <strong>Liquidity Measures</strong> and code some money&#8209;making spells! &#128184;&#129668;</p><h4>References and Reading List</h4><ul><li><p>Kyle, A. S.&#8239;(1985). Continuous auctions and insider trading. <em>Econometrica, 53</em>(6), 1315&#8209;1335.</p></li><li><p>Lee, C. M. C., &amp; Ready, M. J.&#8239;(1991). Inferring trade direction from intraday data. <em>Journal of Finance, 46</em>(2), 733&#8209;746.</p></li><li><p>O&#8217;Hara, M.&#8239;(1995). <em>Market Microstructure Theory</em>. Blackwell.</p></li><li><p>Roll, R.&#8239;(1984). A simple implicit measure of the effective bid&#8209;ask spread in an efficient market. <em>Journal of Finance, 39</em>(4), 1127&#8209;1139.</p></li><li><p>Harris, L.&#8239;(2003). <em>Trading and Exchanges: Market Microstructure for Practitioners</em>. Oxford University Press.</p></li><li><p>Foucault, T., Pagano, M., &amp; R&#246;ell, A.&#8239;(2013). <em>Market Liquidity: Theory, Evidence, and Policy</em>. Oxford University Press.</p></li><li><p>Obizhaeva, A., &amp; Wang, J.&#8239;(2013). Optimal trading strategy and supply/demand dynamics. <em>Journal of Financial Markets, 16</em>(1), 1&#8209;32.</p></li></ul><p></p><h4>Weighted Average Spread Code</h4><p>Here I&#8217;m sharing a simple but effective code snippet that does two things:</p><ol><li><p><strong>Generates a realistic random order&#8209;book snapshot</strong> (bids&#8239;+&#8239;asks) with tick&#8209;aligned prices and sizes that grow deeper in the book.</p></li><li><p><strong>Computes the weighted&#8209;average bid&#8209;ask spread for any trade size&#8239;q</strong> as defined above.</p></li></ol><p>Python, requires <code>pandas</code> and <code>numpy</code>:</p><pre><code>import pandas as pd
import numpy as np

###############################################################################
# 1.  ORDER&#8209;BOOK GENERATOR
###############################################################################
def random_orderbook(
    mid: float = 100.0,               # central price around which we build the book
    tick: float = 0.01,               # price granularity
    spread_ticks: int = 2,            # best&#8209;bid/ask gap in ticks
    depth_levels: int = 20,           # levels per side
    base_size: float = 1_000.0,       # expected size at the best bid/ask
    depth_decay: float = 0.15,        # how quickly size grows deeper in book
    sigma_vol: float = 0.5,           # randomness in size (log&#8209;normal std&#8209;dev)
    rng: np.random.Generator | None = None
) -&gt; pd.DataFrame:
    """
    Build a one&#8209;shot synthetic order book with realistic features:

    &#8226; tick&#8209;aligned prices, symmetric around 'mid'
    &#8226; quoted spread = spread_ticks * tick
    &#8226; depth increases (on average) as we move away from the top
    &#8226; log&#8209;normal noise to avoid perfectly smooth shapes
    """
    rng = rng or np.random.default_rng()
    half_spread = (spread_ticks * tick) / 2

    # --- price ladders -------------------------------------------------------
    ask_px = mid + half_spread + tick * np.arange(depth_levels)
    bid_px = mid - half_spread - tick * np.arange(depth_levels)

    # --- sizes: grow with depth + randomness ---------------------------------
    vol_multiplier = rng.lognormal(mean=0.0, sigma=sigma_vol, size=depth_levels)
    depth_factor = np.exp(depth_decay * np.arange(depth_levels))
    ask_sz = base_size * depth_factor * vol_multiplier
    bid_sz = base_size * depth_factor * vol_multiplier      # symmetric book

    asks = pd.DataFrame({"side": "ask", "price": ask_px, "size": ask_sz})
    bids = pd.DataFrame({"side": "bid", "price": bid_px, "size": bid_sz})

    return pd.concat([bids, asks], ignore_index=True)


###############################################################################
# 2.  EXECUTION&#8209;PRICE ROUTINE
###############################################################################
def _avg_exec_price(book: pd.DataFrame, side: str, qty: float) -&gt; float:
    """
    Walk the book and compute the volume&#8209;weighted average execution
    price for either a buy ('ask' side) or sell ('bid' side) order of size 'qty'.
    """
    side_book = book.query("side == @side").copy()

    # For buys we start at BEST ASK (lowest); for sells at BEST BID (highest)
    side_book = side_book.sort_values(
        "price", ascending=(side == "ask")  # True&#8594;asks low&#8594;high ; False&#8594;bids high&#8594;low
    ).reset_index(drop=True)

    cum = side_book["size"].cumsum()
    if qty &gt; cum.iat[-1]:
        raise ValueError("Requested quantity exceeds book depth.")

    take_full = side_book.loc[cum &lt; qty, ["price", "size"]]
    take_part = side_book.loc[cum &gt;= qty].iloc[0]

    filled_qty = take_full["size"].sum()
    remaining = qty - filled_qty

    vwap_numer = (take_full["price"] * take_full["size"]).sum()
    vwap_numer += take_part["price"] * remaining

    return vwap_numer / qty


###############################################################################
# 3.  WEIGHTED&#8209;AVERAGE SPREAD FOR ANY SIZE q
###############################################################################
def wa_spread(book: pd.DataFrame, q: float) -&gt; float:
    """
    Relative (dimensionless) weighted&#8209;average bid&#8209;ask spread,
    as defined above. Multiply by the mid&#8209;price for absolute units.
    """
    a_q = _avg_exec_price(book, "ask", q)   # cost to BUY  q
    b_q = _avg_exec_price(book, "bid", q)   # proceeds to SELL q
    mid = (book.query("side == 'ask'")["price"].min() +
           book.query("side == 'bid'")["price"].max()) / 2
    return (a_q - b_q) / mid                # relative form (dimensionless)

###############################################################################
# 4.  QUICK DEMO (runs only when executed as a script, not on import)
###############################################################################
if __name__ == "__main__":
    book = random_orderbook()
    for q in [1_000, 5_000, 15_000]:
        print(f"q={q:&gt;6}:  wa&#8209;spread = {wa_spread(book, q):.4%}")</code></pre><p>Here&#8217;s the same for Rustaceans</p><p><code>src/lib.rs</code></p><pre><code>use rand::prelude::*;
<p>Here&#8217;s the same for Rustaceans.</p><p><code>src/lib.rs</code></p><pre><code>use rand::prelude::*;
use rand_distr::{Distribution, LogNormal};

/// One price/size point on a side of the book.
#[derive(Clone, Copy)]
pub struct Level {
    pub price: f64,
    pub size:  f64,
}

/// Sides we can walk.
#[derive(Clone, Copy)]
pub enum Side { Bid, Ask }

/// A synthetic order&#8209;book snapshot.
pub struct OrderBook {
    bids: Vec&lt;Level&gt;,   // sorted high &#8594; low
    asks: Vec&lt;Level&gt;,   // sorted low  &#8594; high
}

impl OrderBook {
    // ------------------------------------------------------------------------
    /// Create a random, *symmetric* order book around `mid`.
    pub fn random(
        mid: f64,
        tick: f64,
        spread_ticks: usize,
        depth_levels: usize,
        base_size: f64,
        depth_decay: f64,
        sigma_vol: f64,
        rng: &amp;mut impl Rng,
    ) -&gt; Self {
        let half_spread = (spread_ticks as f64 * tick) / 2.0;
        let lognorm = LogNormal::new(0.0, sigma_vol).unwrap();

        // Pre&#8209;allocate to avoid re&#8209;allocations
        let mut bids = Vec::with_capacity(depth_levels);
        let mut asks = Vec::with_capacity(depth_levels);

        for lvl in 0..depth_levels {
            let depth_fac = (depth_decay * lvl as f64).exp();
            let noise      = lognorm.sample(rng);
            let size       = base_size * depth_fac * noise;

            // Bid ladder: highest price first
            let bid_price  = mid - half_spread - tick * lvl as f64;
            bids.push(Level { price: bid_price, size });

            // Ask ladder: lowest price first
            let ask_price  = mid + half_spread + tick * lvl as f64;
            asks.push(Level { price: ask_price, size });
        }
        Self { bids, asks }
    }

    // ------------------------------------------------------------------------
    /// Internal helper: average execution price for buying/selling `qty`.
    fn avg_exec_price(&amp;self, side: Side, qty: f64) -&gt; Option&lt;f64&gt; {
        let book = match side { Side::Bid =&gt; &amp;self.bids, Side::Ask =&gt; &amp;self.asks };
        let mut remaining = qty;
        let mut vwap_num  = 0.0;

        for lvl in book {
            if remaining &lt;= 0.0 { break; }
            let take = remaining.min(lvl.size);
            vwap_num += lvl.price * take;
            remaining -= take;
        }
        if remaining &gt; 1e-9 {
            None // depth exhausted
        } else {
            Some(vwap_num / qty)
        }
    }

    // ------------------------------------------------------------------------
    /// Relative weighted&#8209;average spread for order size `q`.
    pub fn wa_spread(&amp;self, q: f64) -&gt; Option&lt;f64&gt; {
        let a_q = self.avg_exec_price(Side::Ask, q)?;
        let b_q = self.avg_exec_price(Side::Bid, q)?;
        let best_ask = self.asks.first()?.price;
        let best_bid = self.bids.first()?.price;
        let mid      = 0.5 * (best_ask + best_bid);
        Some((a_q - b_q) / mid) // dimensionless
    }
}</code></pre><p>A quick demo</p><p><code>src/main.rs</code></p><pre><code>use rand::SeedableRng;
use rand::rngs::StdRng;
use orderbook_spread::OrderBook;

fn main() {
    let mut rng = StdRng::seed_from_u64(42);

    let book = OrderBook::random(
        100.0,   // mid
        0.01,    // tick
        2,       // spread in ticks
        20,      // depth levels
        1_000.0, // size at top of book
        0.15,    // depth&#8209;decay
        0.5,     // log&#8209;normal sigma
        &amp;mut rng,
    );

    for q in [1_000.0, 5_000.0, 15_000.0] {
        match book.wa_spread(q) {
            Some(s) =&gt; println!("q = {:&gt;6.0}:  wa&#8209;spread = {:.4}%", q, 100.0 * s),
            None    =&gt; println!("q = {:&gt;6.0}:  not enough depth", q),
        }
    }
}</code></pre><p><code>Cargo.toml</code></p><pre><code>[package]
name = "orderbook_spread"
version = "0.1.0"
edition = "2021"

[dependencies]
rand = "0.8"
rand_distr = "0.4"</code></code></pre><p>Run and enjoy!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://zerolag.club/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Zero Lag Club! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>