<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[The Data Crab]]></title><description><![CDATA[The Data Crab]]></description><link>https://thedatacrab.com</link><generator>RSS for Node</generator><lastBuildDate>Sun, 19 Apr 2026 03:22:37 GMT</lastBuildDate><atom:link href="https://thedatacrab.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Building a High-Performance Company Name Standardizer with Rust, React, and the SEC API]]></title><description><![CDATA[Data Engineers know the pain: you receive a dataset where one row says "JPMC", another says "J.P. Morgan", and a third says "JPMorgan Chase & Co.". To perform any meaningful aggregation, you need to normalize these into a single "Golden Record."
Whil...]]></description><link>https://thedatacrab.com/building-a-high-performance-company-name-standardizer-with-rust-react-and-the-sec-api</link><guid isPermaLink="true">https://thedatacrab.com/building-a-high-performance-company-name-standardizer-with-rust-react-and-the-sec-api</guid><category><![CDATA[data-engineering]]></category><category><![CDATA[Rust]]></category><category><![CDATA[React]]></category><category><![CDATA[nlp]]></category><dc:creator><![CDATA[The Data Crab]]></dc:creator><pubDate>Mon, 12 Jan 2026 10:34:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768213982605/c59b95cf-b213-4b47-a2b2-e476057cf4ba.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Data Engineers know the pain: you receive a dataset where one row says <strong>"JPMC"</strong>, another says <strong>"J.P. Morgan"</strong>, and a third says <strong>"JPMorgan Chase &amp; Co."</strong>. To perform any meaningful aggregation, you need to normalize these into a single "Golden Record."</p>
<p>While Python (<code>scikit-learn</code> / <code>fuzzywuzzy</code>) is the go-to for this, it can be slow at scale and requires heavy dependencies.</p>
<p>In this post, I’ll show you how to build a <strong>blazing-fast, production-grade Standardization Engine</strong> using <strong>Rust</strong> for the backend and <strong>React</strong> for the UI. We will go beyond simple regex and implement:</p>
<ol>
<li><p><strong>Real-time Master Data</strong>: Fetching live tickers from the <strong>SEC (US Securities and Exchange Commission)</strong>.</p>
</li>
<li><p><strong>Fuzzy Logic</strong>: Using the <strong>Jaro-Winkler</strong> algorithm to handle typos and abbreviations ("Amzn" → "Amazon").</p>
</li>
<li><p><strong>Modern UI</strong>: A polished React frontend with <strong>Dark Mode</strong> and <strong>CSV Export</strong>.</p>
</li>
</ol>
<hr />
<h2 id="heading-the-architecture">🏗️ The Architecture</h2>
<ul>
<li><p><strong>Backend</strong>: Rust (<code>Actix-web</code>)</p>
<ul>
<li><p><strong>Speed</strong>: Handles thousands of requests per second.</p>
</li>
<li><p><strong>Logic</strong>: <code>strsim</code> crate for string distance calculations.</p>
</li>
<li><p><strong>Data</strong>: Fetches <code>company_tickers.json</code> from <a target="_blank" href="http://SEC.gov">SEC.gov</a> on startup.</p>
</li>
</ul>
</li>
<li><p><strong>Frontend</strong>: React (<code>Vite</code>)</p>
<ul>
<li><strong>UX</strong>: Real-time lookup, Batch CSV processing, Dark/Light mode toggle.</li>
</ul>
</li>
</ul>
<hr />
<h2 id="heading-part-1-the-rust-backend">🚀 Part 1: The Rust Backend</h2>
<p>We use <code>Actix-web</code> for the server and <code>strsim</code> for the fuzzy matching logic.</p>
<h3 id="heading-1-initialize-the-project">1. Initialize the Project</h3>
<pre><code class="lang-bash">mkdir company-standardizer
<span class="hljs-built_in">cd</span> company-standardizer
cargo new backend
<span class="hljs-built_in">cd</span> backend
</code></pre>
<h3 id="heading-2-dependencies-cargotoml">2. Dependencies (<code>Cargo.toml</code>)</h3>
<p>We need <code>reqwest</code> to fetch the SEC data and <code>strsim</code> for the math.</p>
<pre><code class="lang-ini"><span class="hljs-section">[package]</span>
<span class="hljs-attr">name</span> = <span class="hljs-string">"backend"</span>
<span class="hljs-attr">version</span> = <span class="hljs-string">"0.1.0"</span>
<span class="hljs-attr">edition</span> = <span class="hljs-string">"2021"</span>

<span class="hljs-section">[dependencies]</span>
<span class="hljs-attr">actix-web</span> = <span class="hljs-string">"4"</span>
<span class="hljs-attr">actix-cors</span> = <span class="hljs-string">"0.6"</span>
<span class="hljs-attr">serde</span> = { version = <span class="hljs-string">"1.0"</span>, features = [<span class="hljs-string">"derive"</span>] }
<span class="hljs-attr">serde_json</span> = <span class="hljs-string">"1.0"</span>
<span class="hljs-attr">strsim</span> = <span class="hljs-string">"0.11"</span> <span class="hljs-comment"># The magic fuzzy matching crate</span>
<span class="hljs-attr">reqwest</span> = { version = <span class="hljs-string">"0.11"</span>, features = [<span class="hljs-string">"json"</span>] } <span class="hljs-comment"># Async client only; the blocking feature is not needed</span>
<span class="hljs-attr">tokio</span> = { version = <span class="hljs-string">"1"</span>, features = [<span class="hljs-string">"full"</span>] }
</code></pre>
<h3 id="heading-3-the-logic-srcmainrshttpmainrs">3. The Logic (<code>src/main.rs</code>)</h3>
<p>This is the core engine. It employs a "Waterfall" matching strategy:</p>
<ol>
<li><p><strong>Ticker Map</strong>: O(1) Lookup (e.g., "MSFT" -&gt; "Microsoft").</p>
</li>
<li><p><strong>Jaro-Winkler</strong>: Measures string similarity. If the score &gt; 0.75, we consider it a match.</p>
</li>
</ol>
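<p>To make the second step less of a black box, here is a minimal, std-only sketch of what a Jaro-Winkler score measures. It is not part of <code>main.rs</code> and is not the <code>strsim</code> source: it follows the textbook definition (prefix bonus capped at 4 characters, scaling factor 0.1), and the crate's handling of edge cases may differ.</p>
<pre><code class="lang-rust">// Illustrative std-only sketch; main.rs uses strsim::jaro_winkler instead.

/// Jaro similarity in [0.0, 1.0]: based on matching characters and their order.
fn jaro(a: &amp;str, b: &amp;str) -&gt; f64 {
    let a: Vec&lt;char&gt; = a.chars().collect();
    let b: Vec&lt;char&gt; = b.chars().collect();
    if a.is_empty() || b.is_empty() {
        return if a.len() == b.len() { 1.0 } else { 0.0 };
    }
    // Two equal characters only count as a match if they are at most
    // `window` positions apart.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut a_hit = vec![false; a.len()];
    let mut b_hit = vec![false; b.len()];
    let mut matches = 0usize;
    for i in 0..a.len() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_hit[j] &amp;&amp; a[i] == b[j] {
                a_hit[i] = true;
                b_hit[j] = true;
                matches += 1;
                break;
            }
        }
    }
    if matches == 0 {
        return 0.0;
    }
    // Walk both match sequences in order and count positions that disagree;
    // half of that count is the number of transpositions.
    let mut out_of_order = 0usize;
    let mut j = 0;
    for i in 0..a.len() {
        if !a_hit[i] {
            continue;
        }
        while !b_hit[j] {
            j += 1;
        }
        if a[i] != b[j] {
            out_of_order += 1;
        }
        j += 1;
    }
    let m = matches as f64;
    let t = (out_of_order / 2) as f64;
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Jaro-Winkler: Jaro plus a bonus for a shared prefix of up to 4 characters.
fn jaro_winkler(a: &amp;str, b: &amp;str) -&gt; f64 {
    let base = jaro(a, b);
    let prefix = a
        .chars()
        .zip(b.chars())
        .take(4)
        .take_while(|(x, y)| x == y)
        .count();
    base + prefix as f64 * 0.1 * (1.0 - base)
}
</code></pre>
<p>With these definitions, <code>jaro_winkler("MARTHA", "MARHTA")</code> comes out at roughly 0.961: all six characters match, one pair is transposed, and the shared prefix "MAR" adds a small bonus. The <code>main.rs</code> listing that follows calls <code>strsim::jaro_winkler</code> rather than this hand-written version.</p>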
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> actix_cors::Cors;
<span class="hljs-keyword">use</span> actix_web::{web, App, HttpResponse, HttpServer, Responder};
<span class="hljs-keyword">use</span> serde::{Deserialize, Serialize};
<span class="hljs-keyword">use</span> std::collections::HashMap;
<span class="hljs-keyword">use</span> std::sync::Mutex;
<span class="hljs-keyword">use</span> strsim::jaro_winkler;

<span class="hljs-comment">// --- Data Structures ---</span>
<span class="hljs-meta">#[derive(Clone, Serialize, Deserialize, Debug)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">CompanyReference</span></span> {
    title: <span class="hljs-built_in">String</span>,
    ticker: <span class="hljs-built_in">String</span>,
}

<span class="hljs-meta">#[derive(Deserialize, Debug)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">SecCompany</span></span> {
    ticker: <span class="hljs-built_in">String</span>,
    title: <span class="hljs-built_in">String</span>,
}

<span class="hljs-meta">#[derive(Serialize, Debug)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">MatchResult</span></span> {
    input: <span class="hljs-built_in">String</span>,
    standardized_name: <span class="hljs-built_in">Option</span>&lt;<span class="hljs-built_in">String</span>&gt;,
    ticker: <span class="hljs-built_in">Option</span>&lt;<span class="hljs-built_in">String</span>&gt;,
    score: <span class="hljs-built_in">f64</span>,
    method: <span class="hljs-built_in">String</span>,
}

<span class="hljs-meta">#[derive(Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">LookupRequest</span></span> { name: <span class="hljs-built_in">String</span> }

<span class="hljs-meta">#[derive(Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">BatchRequest</span></span> { names: <span class="hljs-built_in">Vec</span>&lt;<span class="hljs-built_in">String</span>&gt; }

<span class="hljs-comment">// --- The Engine ---</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">StandardizationEngine</span></span> {
    master_list: <span class="hljs-built_in">Vec</span>&lt;CompanyReference&gt;,
    ticker_map: HashMap&lt;<span class="hljs-built_in">String</span>, <span class="hljs-built_in">String</span>&gt;,
}

<span class="hljs-keyword">impl</span> StandardizationEngine {
    <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">new</span></span>(companies: <span class="hljs-built_in">Vec</span>&lt;CompanyReference&gt;) -&gt; <span class="hljs-keyword">Self</span> {
        <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> ticker_map = HashMap::new();
        <span class="hljs-keyword">for</span> company <span class="hljs-keyword">in</span> &amp;companies {
            ticker_map.insert(company.ticker.clone(), company.title.clone());
        }
        StandardizationEngine { master_list: companies, ticker_map }
    }

    <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">search</span></span>(&amp;<span class="hljs-keyword">self</span>, query: &amp;<span class="hljs-built_in">str</span>) -&gt; MatchResult {
        <span class="hljs-keyword">let</span> clean_query = query.trim();
        <span class="hljs-keyword">let</span> query_upper = clean_query.to_uppercase();

        <span class="hljs-comment">// 1. Ticker Lookup (Exact &amp; Fast)</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Some</span>(name) = <span class="hljs-keyword">self</span>.ticker_map.get(&amp;query_upper) {
            <span class="hljs-keyword">return</span> MatchResult {
                input: query.to_string(),
                standardized_name: <span class="hljs-literal">Some</span>(name.clone()),
                ticker: <span class="hljs-literal">Some</span>(query_upper),
                score: <span class="hljs-number">1.0</span>,
                method: <span class="hljs-string">"Ticker Map"</span>.to_string(),
            };
        }

        <span class="hljs-comment">// 2. Fuzzy Search (Jaro-Winkler)</span>
        <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> best_match: <span class="hljs-built_in">Option</span>&lt;CompanyReference&gt; = <span class="hljs-literal">None</span>;
        <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> best_score = <span class="hljs-number">0.0</span>;

        <span class="hljs-keyword">for</span> company <span class="hljs-keyword">in</span> &amp;<span class="hljs-keyword">self</span>.master_list {
            <span class="hljs-keyword">let</span> score = jaro_winkler(&amp;query_upper, &amp;company.title.to_uppercase());
            <span class="hljs-keyword">if</span> score &gt; best_score {
                best_score = score;
                best_match = <span class="hljs-literal">Some</span>(company.clone());
            }
        }

        <span class="hljs-keyword">let</span> method = <span class="hljs-keyword">if</span> best_score == <span class="hljs-number">1.0</span> { <span class="hljs-string">"Exact Match"</span> } <span class="hljs-keyword">else</span> { <span class="hljs-string">"Jaro-Winkler Fuzzy"</span> };

        <span class="hljs-comment">// Threshold &gt; 0.75 usually indicates a good match for company names</span>
        <span class="hljs-keyword">if</span> best_score &gt; <span class="hljs-number">0.75</span> {
            <span class="hljs-keyword">let</span> m = best_match.unwrap();
            MatchResult {
                input: query.to_string(),
                standardized_name: <span class="hljs-literal">Some</span>(m.title),
                ticker: <span class="hljs-literal">Some</span>(m.ticker),
                score: best_score,
                method: method.to_string(),
            }
        } <span class="hljs-keyword">else</span> {
            MatchResult {
                input: query.to_string(),
                standardized_name: <span class="hljs-literal">None</span>,
                ticker: <span class="hljs-literal">None</span>,
                score: best_score,
                method: <span class="hljs-string">"No Match"</span>.to_string(),
            }
        }
    }
}

<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">AppState</span></span> {
    engine: Mutex&lt;StandardizationEngine&gt;,
}

<span class="hljs-comment">// --- API Handlers ---</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">standardize_one</span></span>(data: web::Data&lt;AppState&gt;, req: web::Json&lt;LookupRequest&gt;) -&gt; <span class="hljs-keyword">impl</span> Responder {
    <span class="hljs-keyword">let</span> engine = data.engine.lock().unwrap();
    <span class="hljs-keyword">let</span> result = engine.search(&amp;req.name);
    HttpResponse::<span class="hljs-literal">Ok</span>().json(result)
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">standardize_batch</span></span>(data: web::Data&lt;AppState&gt;, req: web::Json&lt;BatchRequest&gt;) -&gt; <span class="hljs-keyword">impl</span> Responder {
    <span class="hljs-keyword">let</span> engine = data.engine.lock().unwrap();
    <span class="hljs-keyword">let</span> results: <span class="hljs-built_in">Vec</span>&lt;MatchResult&gt; = req.names.iter().map(|name| engine.search(name)).collect();
    HttpResponse::<span class="hljs-literal">Ok</span>().json(results)
}

<span class="hljs-comment">// --- Data Loader ---</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">fetch_sec_data</span></span>() -&gt; <span class="hljs-built_in">Vec</span>&lt;CompanyReference&gt; {
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Fetching live data from SEC.gov..."</span>);
    <span class="hljs-keyword">let</span> client = reqwest::Client::new();
    <span class="hljs-comment">// SEC asks automated clients to send a descriptive User-Agent; replace this placeholder with your own contact details</span>
    <span class="hljs-keyword">let</span> resp = client.get(<span class="hljs-string">"https://www.sec.gov/files/company_tickers.json"</span>)
        .header(<span class="hljs-string">"User-Agent"</span>, <span class="hljs-string">"TestApp (test@example.com)"</span>) 
        .send().<span class="hljs-keyword">await</span>;

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Ok</span>(response) = resp {
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Ok</span>(map) = response.json::&lt;HashMap&lt;<span class="hljs-built_in">String</span>, SecCompany&gt;&gt;().<span class="hljs-keyword">await</span> {
            <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> companies = <span class="hljs-built_in">Vec</span>::new();
            <span class="hljs-keyword">for</span> (_, val) <span class="hljs-keyword">in</span> map {
                companies.push(CompanyReference { title: val.title, ticker: val.ticker });
            }
            <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Loaded {} companies."</span>, companies.len());
            <span class="hljs-keyword">return</span> companies;
        }
    }
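    <span class="hljs-comment">// Fallback data if the request or the JSON parsing fails</span>
    <span class="hljs-built_in">eprintln!</span>(<span class="hljs-string">"Warning: could not load SEC data, using the built-in fallback list."</span>);
    <span class="hljs-built_in">vec!</span>[CompanyReference { title: <span class="hljs-string">"Apple Inc."</span>.into(), ticker: <span class="hljs-string">"AAPL"</span>.into() }]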
}

<span class="hljs-meta">#[actix_web::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() -&gt; std::io::<span class="hljs-built_in">Result</span>&lt;()&gt; {
    <span class="hljs-keyword">let</span> companies = fetch_sec_data().<span class="hljs-keyword">await</span>;
    <span class="hljs-keyword">let</span> engine = StandardizationEngine::new(companies);
    <span class="hljs-keyword">let</span> app_state = web::Data::new(AppState { engine: Mutex::new(engine) });

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Server running at http://127.0.0.1:8080"</span>);

    HttpServer::new(<span class="hljs-keyword">move</span> || {
        App::new()
            .wrap(Cors::permissive())
            .app_data(app_state.clone())
            .route(<span class="hljs-string">"/api/standardize"</span>, web::post().to(standardize_one))
            .route(<span class="hljs-string">"/api/batch-standardize"</span>, web::post().to(standardize_batch))
    })
    .bind((<span class="hljs-string">"127.0.0.1"</span>, <span class="hljs-number">8080</span>))?
    .run()
    .<span class="hljs-keyword">await</span>
}
</code></pre>
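<p>A design note on <code>AppState</code>: <code>search</code> takes <code>&amp;self</code> and nothing mutates the engine after startup, so the <code>Mutex</code> mostly serializes requests rather than protecting writes. Since <code>web::Data</code> already keeps its contents behind an <code>Arc</code>, dropping the lock may be an option. The std-only sketch below illustrates the idea with plain threads; <code>Engine</code>, <code>sample_engine</code>, and <code>lookup_in_parallel</code> are hypothetical stand-ins, not part of the server code.</p>
<pre><code class="lang-rust">use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for StandardizationEngine: never mutated after construction.
struct Engine {
    ticker_map: HashMap&lt;String, String&gt;,
}

impl Engine {
    // Takes &amp;self, like `search` in main.rs, so concurrent readers are safe.
    fn lookup(&amp;self, ticker: &amp;str) -&gt; Option&lt;String&gt; {
        self.ticker_map.get(&amp;ticker.trim().to_uppercase()).cloned()
    }
}

// Small sample data set for the sketch (not the SEC feed).
fn sample_engine() -&gt; Arc&lt;Engine&gt; {
    let mut ticker_map = HashMap::new();
    ticker_map.insert("AAPL".to_string(), "Apple Inc.".to_string());
    ticker_map.insert("MSFT".to_string(), "Microsoft".to_string());
    Arc::new(Engine { ticker_map })
}

// One thread per query, all reading the same engine through Arc, with no lock.
fn lookup_in_parallel(engine: &amp;Arc&lt;Engine&gt;, queries: &amp;[&amp;str]) -&gt; Vec&lt;Option&lt;String&gt;&gt; {
    let handles: Vec&lt;_&gt; = queries
        .iter()
        .map(|q| {
            let engine = Arc::clone(engine);
            let q = q.to_string();
            thread::spawn(move || engine.lookup(&amp;q))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
</code></pre>
<p>Calling <code>lookup_in_parallel(&amp;sample_engine(), &amp;["aapl", " MSFT ", "ZZZZ"])</code> resolves the first two tickers and returns <code>None</code> for the third, with every thread reading the same map. If the master list ever needs to be refreshed at runtime, a lock (or swapping in a new <code>Arc</code>) becomes necessary again.</p>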
<hr />
<h2 id="heading-part-2-the-react-frontend">🎨 Part 2: The React Frontend</h2>
<p>We use <strong>Vite</strong> for a fast setup. The frontend features a beautiful card-based layout, a Dark Mode toggle, and CSV export functionality.</p>
<h3 id="heading-1-setup">1. Setup</h3>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> .. <span class="hljs-comment"># go to the root directory</span>
npm create vite@latest frontend -- --template react
<span class="hljs-built_in">cd</span> frontend
npm install axios
</code></pre>
<h3 id="heading-2-styling-appcss">2. Styling (<code>App.css</code>)</h3>
<p>This CSS uses variables for instant theme switching.</p>
<pre><code class="lang-css"><span class="hljs-selector-pseudo">:root</span> {
  <span class="hljs-comment">/* Light Theme Variables */</span>
  <span class="hljs-attribute">--bg-color</span>: <span class="hljs-number">#f8f9fa</span>;
  <span class="hljs-attribute">--card-bg</span>: <span class="hljs-number">#ffffff</span>;
  <span class="hljs-attribute">--text-main</span>: <span class="hljs-number">#2c3e50</span>;
  <span class="hljs-attribute">--text-secondary</span>: <span class="hljs-number">#5f6368</span>;
  <span class="hljs-attribute">--accent-color</span>: <span class="hljs-number">#2563eb</span>;
  <span class="hljs-attribute">--accent-hover</span>: <span class="hljs-number">#1d4ed8</span>;
  <span class="hljs-attribute">--border-color</span>: <span class="hljs-number">#e2e8f0</span>;
  <span class="hljs-attribute">--input-bg</span>: <span class="hljs-number">#ffffff</span>;
  <span class="hljs-attribute">--table-header</span>: <span class="hljs-number">#f1f5f9</span>;
  <span class="hljs-attribute">--table-hover</span>: <span class="hljs-number">#f8fafc</span>;
  <span class="hljs-attribute">--shadow</span>: <span class="hljs-number">0</span> <span class="hljs-number">4px</span> <span class="hljs-number">6px</span> -<span class="hljs-number">1px</span> <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.1</span>), <span class="hljs-number">0</span> <span class="hljs-number">2px</span> <span class="hljs-number">4px</span> -<span class="hljs-number">1px</span> <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.06</span>);
}

<span class="hljs-selector-attr">[data-theme=<span class="hljs-string">'dark'</span>]</span> {
  <span class="hljs-comment">/* Dark Theme Variables */</span>
  <span class="hljs-attribute">--bg-color</span>: <span class="hljs-number">#0f172a</span>;
  <span class="hljs-attribute">--card-bg</span>: <span class="hljs-number">#1e293b</span>;
  <span class="hljs-attribute">--text-main</span>: <span class="hljs-number">#f1f5f9</span>;
  <span class="hljs-attribute">--text-secondary</span>: <span class="hljs-number">#94a3b8</span>;
  <span class="hljs-attribute">--accent-color</span>: <span class="hljs-number">#3b82f6</span>;
  <span class="hljs-attribute">--accent-hover</span>: <span class="hljs-number">#60a5fa</span>;
  <span class="hljs-attribute">--border-color</span>: <span class="hljs-number">#334155</span>;
  <span class="hljs-attribute">--input-bg</span>: <span class="hljs-number">#0f172a</span>;
  <span class="hljs-attribute">--table-header</span>: <span class="hljs-number">#334155</span>;
  <span class="hljs-attribute">--table-hover</span>: <span class="hljs-number">#1e293b</span>; <span class="hljs-comment">/* Same value as the card bg; lighten it if rows need a visible hover effect */</span>
  <span class="hljs-attribute">--shadow</span>: <span class="hljs-number">0</span> <span class="hljs-number">4px</span> <span class="hljs-number">6px</span> -<span class="hljs-number">1px</span> <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.5</span>);
}

<span class="hljs-selector-tag">body</span> {
  <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span>;
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--bg-color);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">font-family</span>: <span class="hljs-string">'Inter'</span>, system-ui, -apple-system, sans-serif;
  <span class="hljs-attribute">transition</span>: background-color <span class="hljs-number">0.3s</span>, color <span class="hljs-number">0.3s</span>;
  <span class="hljs-attribute">min-height</span>: <span class="hljs-number">100vh</span>;
  <span class="hljs-attribute">display</span>: flex;
  <span class="hljs-attribute">justify-content</span>: center;
}

<span class="hljs-selector-id">#root</span> {
  <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
  <span class="hljs-attribute">max-width</span>: <span class="hljs-number">1000px</span>;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">2rem</span>;
  <span class="hljs-attribute">text-align</span>: center;
}

<span class="hljs-comment">/* Header &amp; Toggle */</span>
<span class="hljs-selector-tag">header</span> {
  <span class="hljs-attribute">display</span>: flex;
  <span class="hljs-attribute">justify-content</span>: space-between;
  <span class="hljs-attribute">align-items</span>: center;
  <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">2rem</span>;
  <span class="hljs-attribute">padding-bottom</span>: <span class="hljs-number">1rem</span>;
  <span class="hljs-attribute">border-bottom</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
}

<span class="hljs-selector-tag">h1</span> {
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">2rem</span>;
  <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span>;
  <span class="hljs-attribute">background</span>: <span class="hljs-built_in">linear-gradient</span>(<span class="hljs-number">90deg</span>, var(--accent-color), <span class="hljs-number">#8b5cf6</span>);
  <span class="hljs-attribute">-webkit-background-clip</span>: text;
  <span class="hljs-attribute">background-clip</span>: text;
  <span class="hljs-attribute">-webkit-text-fill-color</span>: transparent;
}

<span class="hljs-selector-class">.theme-toggle</span> {
  <span class="hljs-attribute">background</span>: none;
  <span class="hljs-attribute">border</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">1.2rem</span>;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">0.5rem</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">50%</span>;
  <span class="hljs-attribute">cursor</span>: pointer;
  <span class="hljs-attribute">transition</span>: all <span class="hljs-number">0.2s</span>;
  <span class="hljs-attribute">width</span>: <span class="hljs-number">40px</span>;
  <span class="hljs-attribute">height</span>: <span class="hljs-number">40px</span>;
  <span class="hljs-attribute">display</span>: flex;
  <span class="hljs-attribute">align-items</span>: center;
  <span class="hljs-attribute">justify-content</span>: center;
}
<span class="hljs-selector-class">.theme-toggle</span><span class="hljs-selector-pseudo">:hover</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--border-color);
}

<span class="hljs-comment">/* Cards */</span>
<span class="hljs-selector-class">.card</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--card-bg);
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">2rem</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">12px</span>;
  <span class="hljs-attribute">box-shadow</span>: <span class="hljs-built_in">var</span>(--shadow);
  <span class="hljs-attribute">border</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">2rem</span>;
  <span class="hljs-attribute">transition</span>: background-color <span class="hljs-number">0.3s</span>;
}

<span class="hljs-selector-tag">h2</span> {
  <span class="hljs-attribute">margin-top</span>: <span class="hljs-number">0</span>;
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">1.25rem</span>;
  <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">1.5rem</span>;
  <span class="hljs-attribute">text-align</span>: left;
}

<span class="hljs-comment">/* Inputs &amp; Buttons */</span>
<span class="hljs-selector-class">.search-box</span> {
  <span class="hljs-attribute">display</span>: flex;
  <span class="hljs-attribute">gap</span>: <span class="hljs-number">12px</span>;
}

<span class="hljs-selector-tag">input</span><span class="hljs-selector-attr">[type=<span class="hljs-string">"text"</span>]</span> {
  <span class="hljs-attribute">flex</span>: <span class="hljs-number">1</span>;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">12px</span> <span class="hljs-number">16px</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">border</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--input-bg);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">1rem</span>;
  <span class="hljs-attribute">outline</span>: none;
  <span class="hljs-attribute">transition</span>: border-color <span class="hljs-number">0.2s</span>;
}

<span class="hljs-selector-tag">input</span><span class="hljs-selector-attr">[type=<span class="hljs-string">"text"</span>]</span><span class="hljs-selector-pseudo">:focus</span> {
  <span class="hljs-attribute">border-color</span>: <span class="hljs-built_in">var</span>(--accent-color);
  <span class="hljs-attribute">box-shadow</span>: <span class="hljs-number">0</span> <span class="hljs-number">0</span> <span class="hljs-number">0</span> <span class="hljs-number">2px</span> <span class="hljs-built_in">rgba</span>(<span class="hljs-number">37</span>, <span class="hljs-number">99</span>, <span class="hljs-number">235</span>, <span class="hljs-number">0.2</span>);
}

<span class="hljs-selector-tag">button</span><span class="hljs-selector-class">.primary-btn</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--accent-color);
  <span class="hljs-attribute">color</span>: white;
  <span class="hljs-attribute">border</span>: none;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">12px</span> <span class="hljs-number">24px</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">font-weight</span>: <span class="hljs-number">600</span>;
  <span class="hljs-attribute">cursor</span>: pointer;
  <span class="hljs-attribute">transition</span>: background-color <span class="hljs-number">0.2s</span>;
}

<span class="hljs-selector-tag">button</span><span class="hljs-selector-class">.primary-btn</span><span class="hljs-selector-pseudo">:hover</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--accent-hover);
}

<span class="hljs-selector-tag">button</span><span class="hljs-selector-pseudo">:disabled</span> {
  <span class="hljs-attribute">opacity</span>: <span class="hljs-number">0.7</span>;
  <span class="hljs-attribute">cursor</span>: not-allowed;
}

<span class="hljs-comment">/* File Upload */</span>
<span class="hljs-selector-class">.file-upload-label</span> {
  <span class="hljs-attribute">display</span>: inline-block;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">12px</span> <span class="hljs-number">24px</span>;
  <span class="hljs-attribute">border</span>: <span class="hljs-number">2px</span> dashed <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">cursor</span>: pointer;
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-secondary);
  <span class="hljs-attribute">font-weight</span>: <span class="hljs-number">500</span>;
  <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
  <span class="hljs-attribute">box-sizing</span>: border-box;
  <span class="hljs-attribute">transition</span>: all <span class="hljs-number">0.2s</span>;
}

<span class="hljs-selector-class">.file-upload-label</span><span class="hljs-selector-pseudo">:hover</span> {
  <span class="hljs-attribute">border-color</span>: <span class="hljs-built_in">var</span>(--accent-color);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--accent-color);
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">rgba</span>(<span class="hljs-number">37</span>, <span class="hljs-number">99</span>, <span class="hljs-number">235</span>, <span class="hljs-number">0.05</span>);
}

<span class="hljs-comment">/* Results Grid (Single) */</span>
<span class="hljs-selector-class">.result-grid</span> {
  <span class="hljs-attribute">display</span>: grid;
  <span class="hljs-attribute">grid-template-columns</span>: <span class="hljs-built_in">repeat</span>(auto-fit, minmax(<span class="hljs-number">200px</span>, <span class="hljs-number">1</span>fr));
  <span class="hljs-attribute">gap</span>: <span class="hljs-number">1.5rem</span>;
  <span class="hljs-attribute">margin-top</span>: <span class="hljs-number">1.5rem</span>;
  <span class="hljs-attribute">text-align</span>: left;
  <span class="hljs-attribute">padding-top</span>: <span class="hljs-number">1.5rem</span>;
  <span class="hljs-attribute">border-top</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
}

<span class="hljs-selector-class">.result-item</span> <span class="hljs-selector-tag">label</span> {
  <span class="hljs-attribute">display</span>: block;
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">0.85rem</span>;
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-secondary);
  <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">0.25rem</span>;
  <span class="hljs-attribute">text-transform</span>: uppercase;
  <span class="hljs-attribute">letter-spacing</span>: <span class="hljs-number">0.05em</span>;
}

<span class="hljs-selector-class">.result-item</span> <span class="hljs-selector-class">.value</span> {
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">1.1rem</span>;
  <span class="hljs-attribute">font-weight</span>: <span class="hljs-number">600</span>;
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
}

<span class="hljs-selector-class">.ticker-badge</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">2px</span> <span class="hljs-number">6px</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">4px</span>;
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">0.8rem</span>;
  <span class="hljs-attribute">margin-left</span>: <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">vertical-align</span>: middle;
}

<span class="hljs-comment">/* Table */</span>
<span class="hljs-selector-class">.table-container</span> {
  <span class="hljs-attribute">overflow-x</span>: auto;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">border</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
}

<span class="hljs-selector-tag">table</span> {
  <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
  <span class="hljs-attribute">border-collapse</span>: collapse;
}

<span class="hljs-selector-tag">th</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--table-header);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-secondary);
  <span class="hljs-attribute">font-weight</span>: <span class="hljs-number">600</span>;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">12px</span> <span class="hljs-number">16px</span>;
  <span class="hljs-attribute">text-align</span>: left;
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">0.9rem</span>;
}

<span class="hljs-selector-tag">td</span> {
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">12px</span> <span class="hljs-number">16px</span>;
  <span class="hljs-attribute">border-top</span>: <span class="hljs-number">1px</span> solid <span class="hljs-built_in">var</span>(--border-color);
  <span class="hljs-attribute">color</span>: <span class="hljs-built_in">var</span>(--text-main);
  <span class="hljs-attribute">text-align</span>: left;
}

<span class="hljs-selector-tag">tr</span><span class="hljs-selector-pseudo">:hover</span> <span class="hljs-selector-tag">td</span> {
  <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">var</span>(--table-hover); <span class="hljs-comment">/* Fixed hover visibility */</span>
}

<span class="hljs-comment">/* Utilities */</span>
<span class="hljs-selector-class">.status-badge</span> {
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">4px</span> <span class="hljs-number">8px</span>;
  <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">4px</span>;
  <span class="hljs-attribute">font-size</span>: <span class="hljs-number">0.85rem</span>;
  <span class="hljs-attribute">font-weight</span>: <span class="hljs-number">600</span>;
}
</code></pre>
<h3 id="heading-3-the-logic-appjsx">3. The Logic (<code>App.jsx</code>)</h3>
<p>Features:</p>
<ul>
<li><p><strong>Batch Processing</strong>: Upload a <code>.txt</code> or <code>.csv</code> file with one company name per line and get back a table of results.</p>
</li>
<li><p><strong>CSV Download</strong>: Once a batch has been processed, a download button exports the results as a CSV report.</p>
</li>
<li><p><strong>Theme Toggle</strong>: Persists your choice in <code>localStorage</code> and falls back to the OS color-scheme preference when nothing is saved.</p>
</li>
</ul>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { useState, useEffect } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>
<span class="hljs-keyword">import</span> axios <span class="hljs-keyword">from</span> <span class="hljs-string">'axios'</span>
<span class="hljs-keyword">import</span> <span class="hljs-string">'./App.css'</span>

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">App</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-keyword">const</span> [inputText, setInputText] = useState(<span class="hljs-string">''</span>)
    <span class="hljs-keyword">const</span> [result, setResult] = useState(<span class="hljs-literal">null</span>)
    <span class="hljs-keyword">const</span> [batchResults, setBatchResults] = useState([])
    <span class="hljs-keyword">const</span> [loading, setLoading] = useState(<span class="hljs-literal">false</span>)
    <span class="hljs-keyword">const</span> [theme, setTheme] = useState(<span class="hljs-string">'light'</span>)

    <span class="hljs-comment">// Theme Init</span>
    useEffect(<span class="hljs-function">() =&gt;</span> {
        <span class="hljs-keyword">const</span> savedTheme = <span class="hljs-built_in">localStorage</span>.getItem(<span class="hljs-string">'theme'</span>)
        <span class="hljs-keyword">const</span> prefersDark = <span class="hljs-built_in">window</span>.matchMedia(<span class="hljs-string">'(prefers-color-scheme: dark)'</span>).matches

        <span class="hljs-keyword">if</span> (savedTheme) {
            setTheme(savedTheme)
            <span class="hljs-built_in">document</span>.documentElement.setAttribute(<span class="hljs-string">'data-theme'</span>, savedTheme)
        } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (prefersDark) {
            setTheme(<span class="hljs-string">'dark'</span>)
            <span class="hljs-built_in">document</span>.documentElement.setAttribute(<span class="hljs-string">'data-theme'</span>, <span class="hljs-string">'dark'</span>)
        }
    }, [])

    <span class="hljs-keyword">const</span> toggleTheme = <span class="hljs-function">() =&gt;</span> {
        <span class="hljs-keyword">const</span> newTheme = theme === <span class="hljs-string">'light'</span> ? <span class="hljs-string">'dark'</span> : <span class="hljs-string">'light'</span>
        setTheme(newTheme)
        <span class="hljs-built_in">document</span>.documentElement.setAttribute(<span class="hljs-string">'data-theme'</span>, newTheme)
        <span class="hljs-built_in">localStorage</span>.setItem(<span class="hljs-string">'theme'</span>, newTheme)
    }

    <span class="hljs-comment">// --- API Handlers ---</span>

    <span class="hljs-keyword">const</span> handleCheck = <span class="hljs-keyword">async</span> () =&gt; {
        <span class="hljs-keyword">if</span> (!inputText) <span class="hljs-keyword">return</span>
        setLoading(<span class="hljs-literal">true</span>)
        <span class="hljs-keyword">try</span> {
            <span class="hljs-keyword">const</span> res = <span class="hljs-keyword">await</span> axios.post(<span class="hljs-string">'http://127.0.0.1:8080/api/standardize'</span>, { <span class="hljs-attr">name</span>: inputText })
            setResult(res.data)
        } <span class="hljs-keyword">catch</span> (err) {
            <span class="hljs-built_in">console</span>.error(err)
            alert(<span class="hljs-string">"Error connecting to backend"</span>)
        }
        setLoading(<span class="hljs-literal">false</span>)
    }

    <span class="hljs-keyword">const</span> handleFileUpload = <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
        <span class="hljs-keyword">const</span> file = e.target.files[<span class="hljs-number">0</span>]
        <span class="hljs-keyword">if</span> (!file) <span class="hljs-keyword">return</span>

        <span class="hljs-keyword">const</span> reader = <span class="hljs-keyword">new</span> FileReader()
        reader.onload = <span class="hljs-keyword">async</span> (event) =&gt; {
            <span class="hljs-keyword">const</span> text = event.target.result
            <span class="hljs-keyword">const</span> names = text.split(<span class="hljs-string">'\n'</span>).map(<span class="hljs-function"><span class="hljs-params">n</span> =&gt;</span> n.trim()).filter(<span class="hljs-function"><span class="hljs-params">n</span> =&gt;</span> n)

            setLoading(<span class="hljs-literal">true</span>)
            <span class="hljs-keyword">try</span> {
                <span class="hljs-keyword">const</span> res = <span class="hljs-keyword">await</span> axios.post(<span class="hljs-string">'http://127.0.0.1:8080/api/batch-standardize'</span>, { names })
                setBatchResults(res.data)
            } <span class="hljs-keyword">catch</span> (err) {
                <span class="hljs-built_in">console</span>.error(err)
                alert(<span class="hljs-string">"Error processing batch"</span>)
            }
            setLoading(<span class="hljs-literal">false</span>)
        }
        reader.readAsText(file)
    }

    <span class="hljs-comment">// --- CSV Download Logic ---</span>

    <span class="hljs-keyword">const</span> downloadCSV = <span class="hljs-function">() =&gt;</span> {
        <span class="hljs-keyword">if</span> (batchResults.length === <span class="hljs-number">0</span>) <span class="hljs-keyword">return</span>

        <span class="hljs-comment">// 1. Define Headers</span>
        <span class="hljs-keyword">const</span> headers = [<span class="hljs-string">"Input Name"</span>, <span class="hljs-string">"Standardized Name"</span>, <span class="hljs-string">"Ticker"</span>, <span class="hljs-string">"Method"</span>, <span class="hljs-string">"Confidence Score"</span>]

        <span class="hljs-comment">// 2. Format Data Rows</span>
        <span class="hljs-keyword">const</span> csvRows = batchResults.map(<span class="hljs-function"><span class="hljs-params">item</span> =&gt;</span> {
            <span class="hljs-comment">// Escape quotes in data to prevent CSV breakage</span>
            <span class="hljs-keyword">const</span> safeInput = <span class="hljs-string">`"<span class="hljs-subst">${item.input.replace(<span class="hljs-regexp">/"/g</span>, <span class="hljs-string">'""'</span>)}</span>"`</span>
            <span class="hljs-keyword">const</span> safeName = <span class="hljs-string">`"<span class="hljs-subst">${(item.standardized_name || <span class="hljs-string">""</span>).replace(<span class="hljs-regexp">/"/g</span>, <span class="hljs-string">'""'</span>)}</span>"`</span>
            <span class="hljs-keyword">const</span> safeTicker = <span class="hljs-string">`"<span class="hljs-subst">${(item.ticker || <span class="hljs-string">""</span>).replace(<span class="hljs-regexp">/"/g</span>, <span class="hljs-string">'""'</span>)}</span>"`</span>
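            <span class="hljs-keyword">const</span> safeMethod = <span class="hljs-string">`"<span class="hljs-subst">${(item.method || <span class="hljs-string">""</span>).replace(<span class="hljs-regexp">/"/g</span>, <span class="hljs-string">'""'</span>)}</span>"`</span>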
            <span class="hljs-keyword">const</span> safeScore = <span class="hljs-string">`"<span class="hljs-subst">${(item.score * <span class="hljs-number">100</span>).toFixed(<span class="hljs-number">1</span>)}</span>%"`</span>

            <span class="hljs-keyword">return</span> [safeInput, safeName, safeTicker, safeMethod, safeScore].join(<span class="hljs-string">","</span>)
        })

        <span class="hljs-comment">// 3. Combine and Trigger Download</span>
        <span class="hljs-keyword">const</span> csvContent = [headers.join(<span class="hljs-string">","</span>), ...csvRows].join(<span class="hljs-string">"\n"</span>)
        <span class="hljs-keyword">const</span> blob = <span class="hljs-keyword">new</span> Blob([csvContent], { <span class="hljs-attr">type</span>: <span class="hljs-string">'text/csv;charset=utf-8;'</span> })
        <span class="hljs-keyword">const</span> url = URL.createObjectURL(blob)

        <span class="hljs-keyword">const</span> link = <span class="hljs-built_in">document</span>.createElement(<span class="hljs-string">"a"</span>)
        link.href = url
        link.setAttribute(<span class="hljs-string">"download"</span>, <span class="hljs-string">"standardized_companies_report.csv"</span>)
        <span class="hljs-built_in">document</span>.body.appendChild(link)
        link.click()
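        <span class="hljs-built_in">document</span>.body.removeChild(link)
        URL.revokeObjectURL(url) <span class="hljs-comment">// Release the blob URL once the download has been triggered</span>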
    }

    <span class="hljs-comment">// --- Render Helpers ---</span>

    <span class="hljs-keyword">const</span> getScoreColor = <span class="hljs-function">(<span class="hljs-params">score</span>) =&gt;</span> {
        <span class="hljs-keyword">if</span> (score === <span class="hljs-number">1.0</span>) <span class="hljs-keyword">return</span> <span class="hljs-string">'#22c55e'</span>
        <span class="hljs-keyword">if</span> (score &gt; <span class="hljs-number">0.85</span>) <span class="hljs-keyword">return</span> <span class="hljs-string">'#3b82f6'</span>
        <span class="hljs-keyword">if</span> (score &gt; <span class="hljs-number">0.75</span>) <span class="hljs-keyword">return</span> <span class="hljs-string">'#f59e0b'</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">'#ef4444'</span>
    }

    <span class="hljs-keyword">return</span> (
        <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"app-container"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">header</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">h1</span>&gt;</span>Company Standardizer<span class="hljs-tag">&lt;/<span class="hljs-name">h1</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">color:</span> '<span class="hljs-attr">var</span>(<span class="hljs-attr">--text-secondary</span>)', <span class="hljs-attr">fontSize:</span> '<span class="hljs-attr">0.9rem</span>' }}&gt;</span>
            Powered by Rust &amp; SEC API
          <span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"theme-toggle"</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{toggleTheme}</span> <span class="hljs-attr">title</span>=<span class="hljs-string">"Toggle Theme"</span>&gt;</span>
                    {theme === 'light' ? '🌙' : '☀️'}
                <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">header</span>&gt;</span>

            {/* Single Lookup Card */}
            <span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"card"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Single Entity Lookup<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"search-box"</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">input</span>
                        <span class="hljs-attr">type</span>=<span class="hljs-string">"text"</span>
                        <span class="hljs-attr">value</span>=<span class="hljs-string">{inputText}</span>
                        <span class="hljs-attr">onChange</span>=<span class="hljs-string">{(e)</span> =&gt;</span> setInputText(e.target.value)}
                        placeholder="Enter Company Name or Ticker (e.g., Apple, NVDA)"
                        onKeyDown={(e) =&gt; e.key === 'Enter' &amp;&amp; handleCheck()}
                    /&gt;
                    <span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"primary-btn"</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{handleCheck}</span> <span class="hljs-attr">disabled</span>=<span class="hljs-string">{loading}</span>&gt;</span>
                        {loading ? 'Searching...' : 'Search'}
                    <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

                {result &amp;&amp; (
                    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-grid"</span>&gt;</span>
                        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">label</span>&gt;</span>Input<span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"value"</span>&gt;</span>{result.input}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

                        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">label</span>&gt;</span>Standardized Name<span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"value"</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">display:</span> '<span class="hljs-attr">flex</span>', <span class="hljs-attr">flexDirection:</span> '<span class="hljs-attr">column</span>', <span class="hljs-attr">alignItems:</span> '<span class="hljs-attr">flex-start</span>', <span class="hljs-attr">gap:</span> '<span class="hljs-attr">5px</span>' }}&gt;</span>
                                {result.standardized_name ? (
                                    <span class="hljs-tag">&lt;&gt;</span>
                                        <span class="hljs-tag">&lt;<span class="hljs-name">span</span>&gt;</span>{result.standardized_name}<span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
                                        {result.ticker &amp;&amp; (
                                            <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"ticker-badge"</span> <span class="hljs-attr">title</span>=<span class="hljs-string">"Stock Ticker"</span>&gt;</span>
                        {result.ticker}
                      <span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
                                        )}
                                    <span class="hljs-tag">&lt;/&gt;</span>
                                ) : (
                                    <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{color:</span> '<span class="hljs-attr">var</span>(<span class="hljs-attr">--text-secondary</span>)'}}&gt;</span>No Match<span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
                                )}
                            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

                        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">label</span>&gt;</span>Method<span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"value"</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">fontSize:</span> '<span class="hljs-attr">0.95rem</span>' }}&gt;</span>{result.method}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

                        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">label</span>&gt;</span>Confidence<span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"value"</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">color:</span> <span class="hljs-attr">getScoreColor</span>(<span class="hljs-attr">result.score</span>) }}&gt;</span>
                                {(result.score * 100).toFixed(1)}%
                            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                )}
            <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span></span>

            {<span class="hljs-comment">/* Batch Processing Card */</span>}
            &lt;section className=<span class="hljs-string">"card"</span>&gt;
                <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">display:</span> '<span class="hljs-attr">flex</span>', <span class="hljs-attr">justifyContent:</span> '<span class="hljs-attr">space-between</span>', <span class="hljs-attr">alignItems:</span> '<span class="hljs-attr">center</span>', <span class="hljs-attr">marginBottom:</span> '<span class="hljs-attr">1.5rem</span>' }}&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">h2</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">marginBottom:</span> <span class="hljs-attr">0</span> }}&gt;</span>Batch Processing<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>

                    {/* Download Button - Only shows when there are results */}
                    {batchResults.length &gt; 0 &amp;&amp; (
                        <span class="hljs-tag">&lt;<span class="hljs-name">button</span>
                            <span class="hljs-attr">onClick</span>=<span class="hljs-string">{downloadCSV}</span>
                            <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span>
                                <span class="hljs-attr">backgroundColor:</span> '#<span class="hljs-attr">22c55e</span>', // <span class="hljs-attr">Green</span> <span class="hljs-attr">for</span> <span class="hljs-attr">Excel</span>/<span class="hljs-attr">CSV</span>
                                <span class="hljs-attr">color:</span> '<span class="hljs-attr">white</span>',
                                <span class="hljs-attr">border:</span> '<span class="hljs-attr">none</span>',
                                <span class="hljs-attr">padding:</span> '<span class="hljs-attr">8px</span> <span class="hljs-attr">16px</span>',
                                <span class="hljs-attr">borderRadius:</span> '<span class="hljs-attr">6px</span>',
                                <span class="hljs-attr">cursor:</span> '<span class="hljs-attr">pointer</span>',
                                <span class="hljs-attr">fontSize:</span> '<span class="hljs-attr">0.9rem</span>',
                                <span class="hljs-attr">fontWeight:</span> '<span class="hljs-attr">600</span>',
                                <span class="hljs-attr">boxShadow:</span> '<span class="hljs-attr">0</span> <span class="hljs-attr">2px</span> <span class="hljs-attr">4px</span> <span class="hljs-attr">rgba</span>(<span class="hljs-attr">0</span>,<span class="hljs-attr">0</span>,<span class="hljs-attr">0</span>,<span class="hljs-attr">0.1</span>)'
                            }}
                        &gt;</span>
                            ⬇ Download CSV
                        <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
                    )}
                <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>

                <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">marginBottom:</span> '<span class="hljs-attr">1.5rem</span>' }}&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">label</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"file-upload-label"</span>&gt;</span>
                        <span class="hljs-tag">&lt;<span class="hljs-name">input</span>
                            <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span>
                            <span class="hljs-attr">onChange</span>=<span class="hljs-string">{handleFileUpload}</span>
                            <span class="hljs-attr">accept</span>=<span class="hljs-string">".txt,.csv"</span>
                            <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">display:</span> '<span class="hljs-attr">none</span>' }}
                        /&gt;</span>
                        {loading ? "Processing..." : "Click to Upload .txt or .csv List"}
                    <span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>

                {loading &amp;&amp; batchResults.length === <span class="hljs-number">0</span> &amp;&amp; <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">p</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{color:</span> '<span class="hljs-attr">var</span>(<span class="hljs-attr">--text-secondary</span>)'}}&gt;</span>Processing batch file...<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span></span>}

                {batchResults.length &gt; <span class="hljs-number">0</span> &amp;&amp; (
                    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"table-container"</span>&gt;</span>
                        <span class="hljs-tag">&lt;<span class="hljs-name">table</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">thead</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
                                <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Input<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                                <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Standardized Name<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                                <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Ticker<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                                <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Method<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                                <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Score<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                            <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
                            <span class="hljs-tag">&lt;/<span class="hljs-name">thead</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">tbody</span>&gt;</span>
                            {batchResults.map((item, idx) =&gt; (
                                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span> <span class="hljs-attr">key</span>=<span class="hljs-string">{idx}</span>&gt;</span>
                                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>{item.input}<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>{item.standardized_name || "-"}<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>
                                        {item.ticker ? <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"ticker-badge"</span>&gt;</span>{item.ticker}<span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span> : <span class="hljs-tag">&lt;<span class="hljs-name">span</span> <span class="hljs-attr">style</span>=<span class="hljs-string">{{color:</span>'<span class="hljs-attr">var</span>(<span class="hljs-attr">--text-secondary</span>)'}}&gt;</span>-<span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>}
                                    <span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>{item.method}<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>
                      <span class="hljs-tag">&lt;<span class="hljs-name">span</span>
                          <span class="hljs-attr">className</span>=<span class="hljs-string">"status-badge"</span>
                          <span class="hljs-attr">style</span>=<span class="hljs-string">{{</span>
                              <span class="hljs-attr">backgroundColor:</span> `${<span class="hljs-attr">getScoreColor</span>(<span class="hljs-attr">item.score</span>)}<span class="hljs-attr">20</span>`,
                              <span class="hljs-attr">color:</span> <span class="hljs-attr">getScoreColor</span>(<span class="hljs-attr">item.score</span>)
                          }}
                      &gt;</span>
                        {(item.score * 100).toFixed(0)}%
                      <span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>
                                    <span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
                            ))}
                            <span class="hljs-tag">&lt;/<span class="hljs-name">tbody</span>&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">table</span>&gt;</span>
                    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
                )}
            &lt;/section&gt;
        &lt;/div&gt;
    )
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> App
</code></pre>
<hr />
<h2 id="heading-how-to-run">⚡ How to Run</h2>
<p>Follow these exact steps to run the application locally.</p>
<h3 id="heading-1-start-the-backend">1. Start the Backend</h3>
<p>Open a terminal in the <code>backend/</code> directory:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># This compiles the Rust code and starts the server</span>
cargo run
</code></pre>
<p><em>Wait until you see:</em> <code>Successfully loaded X companies... Server running at</code> <a target="_blank" href="http://127.0.0.1:8080"><code>http://127.0.0.1:8080</code></a>.</p>
<h3 id="heading-2-start-the-frontend">2. Start the Frontend</h3>
<p>Open a <strong>new</strong> terminal window in the <code>frontend/</code> directory:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install dependencies (only needed the first time)</span>
npm install

<span class="hljs-comment"># Start the dev server</span>
npm run dev
</code></pre>
<p><em>Click the URL displayed (usually</em> <a target="_blank" href="http://localhost:5173"><code>http://localhost:5173</code></a>).</p>
<h3 id="heading-3-test-it-out">3. Test It Out</h3>
<ul>
<li><p><strong>Exact Match:</strong> Type <code>MSFT</code>. The system detects the ticker and returns <strong>Microsoft Corporation</strong>.</p>
</li>
<li><p><strong>Fuzzy Match:</strong> Type <code>Amzn</code>. The system uses Jaro-Winkler logic to match it to <strong>Amazon.com Inc.</strong> with high confidence.</p>
</li>
<li><p><strong>Batch:</strong> Create a text file named <code>test.txt</code> with the content:</p>
<pre><code class="lang-plaintext">  Appl
  Teslaa
  JPMvc
</code></pre>
<p>  Upload it, view the table, and click <strong>Download CSV</strong> to get your report.</p>
</li>
</ul>
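<p>To make the fuzzy-match step less of a black box, here is a minimal, dependency-free sketch of the textbook Jaro-Winkler score. This is an illustration only, not the backend code from this post: the function names <code>jaro</code> and <code>jaro_winkler</code>, the 0.1 prefix weight, and the 4-character prefix cap are the standard textbook choices and are assumptions here.</p>
<pre><code class="lang-rust">/// Jaro similarity between two strings, in the range 0.0..=1.0.
fn jaro(a: &amp;str, b: &amp;str) -&gt; f64 {
    let a: Vec&lt;char&gt; = a.chars().collect();
    let b: Vec&lt;char&gt; = b.chars().collect();
    if a.is_empty() &amp;&amp; b.is_empty() {
        return 1.0;
    }
    if a.is_empty() || b.is_empty() {
        return 0.0;
    }

    // Characters only count as a match inside this sliding window.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut a_hit = vec![false; a.len()];
    let mut b_hit = vec![false; b.len()];
    let mut matches = 0usize;

    for i in 0..a.len() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_hit[j] &amp;&amp; a[i] == b[j] {
                a_hit[i] = true;
                b_hit[j] = true;
                matches += 1;
                break;
            }
        }
    }
    if matches == 0 {
        return 0.0;
    }

    // Count matched characters that appear in a different order.
    let mut out_of_order = 0usize;
    let mut j = 0;
    for i in 0..a.len() {
        if !a_hit[i] {
            continue;
        }
        while !b_hit[j] {
            j += 1;
        }
        if a[i] != b[j] {
            out_of_order += 1;
        }
        j += 1;
    }

    let m = matches as f64;
    let t = (out_of_order / 2) as f64;
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Jaro-Winkler: boosts the Jaro score for a shared prefix (up to 4 chars).
fn jaro_winkler(a: &amp;str, b: &amp;str) -&gt; f64 {
    let j = jaro(a, b);
    let prefix = a
        .chars()
        .zip(b.chars())
        .take(4)
        .take_while(|(x, y)| x == y)
        .count();
    j + prefix as f64 * 0.1 * (1.0 - j)
}

fn main() {
    // Lowercase both sides first so case differences do not affect the score.
    let query = "Amzn".to_lowercase();
    let candidate = "Amazon".to_lowercase();
    println!("score = {:.3}", jaro_winkler(&amp;query, &amp;candidate));
}
</code></pre>
<p>The prefix boost is why abbreviations that keep the first letters of a name tend to score well, while a typo in the first character hurts more than one at the end.</p>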
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768213892738/88055094-b9c1-4f71-99d3-4791e8b9806b.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-conclusion">Conclusion</h3>
<p>By combining the performance of Rust with the interactivity of React, we've built a tool that solves a real Data Engineering problem. This architecture is easy to extend: you could swap the SEC API for an internal SQL database or add machine learning models for even smarter matching.</p>
]]></content:encoded></item><item><title><![CDATA[Breaking the Rust Loop: Building Your First Async Data Pipeline (for Data Engineers)]]></title><description><![CDATA[As a Data Engineer, I often found myself stuck in the "Rust Tutorial Loop." I’d learn the syntax, fight the borrow checker, and then forget it all because I didn't have a practical use case.
The breakthrough happened when I stopped trying to learn "R...]]></description><link>https://thedatacrab.com/breaking-the-rust-loop-building-your-first-async-data-pipeline-for-data-engineers</link><guid isPermaLink="true">https://thedatacrab.com/breaking-the-rust-loop-building-your-first-async-data-pipeline-for-data-engineers</guid><category><![CDATA[RustLang]]></category><category><![CDATA[data-engineering]]></category><category><![CDATA[asynchronous]]></category><category><![CDATA[Rust]]></category><category><![CDATA[ETL]]></category><dc:creator><![CDATA[The Data Crab]]></dc:creator><pubDate>Sat, 10 Jan 2026 13:07:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768051093753/56a18b2e-41a0-4ba7-a71d-2ceff8bc478e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a Data Engineer, I often found myself stuck in the "Rust Tutorial Loop." I’d learn the syntax, fight the borrow checker, and then forget it all because I didn't have a practical use case.</p>
<p>The breakthrough happened when I stopped trying to learn "Rust" and started trying to build "Data Engineering tools" <em>with</em> Rust.</p>
<p>In this post, we will build a classic <strong>Extract-Load (EL)</strong> pipeline. Instead of Python, Requests, and Pandas, we will use <strong>Rust, Reqwest, and SQLite</strong>. We will fetch real-time Bitcoin data from a public API, validate the schema strictly (something plain Python dictionaries do not enforce on their own), and archive it into a database.</p>
<h2 id="heading-why-rust-for-data-engineering">Why Rust for Data Engineering?</h2>
<ul>
<li><p><strong>Schema Safety:</strong> If the API changes its data format, our code fails <em>before</em> processing, not after filling the database with garbage.</p>
</li>
<li><p><strong>Single Binary:</strong> No virtual environments or <code>pip install</code>. The final output is a single file you can drop onto a server or Lambda function.</p>
</li>
<li><p><strong>Performance:</strong> It handles high-throughput async tasks with a fraction of the memory Python requires.</p>
</li>
</ul>
<h2 id="heading-the-stack">The Stack</h2>
<p>We are using the "Big Four" crates (libraries) for Rust data ops:</p>
<ol>
<li><p><code>reqwest</code>: The HTTP client (Rust's <code>requests</code>).</p>
</li>
<li><p><code>tokio</code>: The async runtime (handles concurrency).</p>
</li>
<li><p><code>serde</code> (plus <code>serde_json</code>): Serialization/Deserialization (turns JSON into Structs).</p>
</li>
<li><p><code>rusqlite</code>: Lightweight SQLite interaction.</p>
</li>
</ol>
<h2 id="heading-step-1-project-setup">Step 1: Project Setup</h2>
<p>First, initialize a new project and add the dependencies.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create the project</span>
cargo new crypto_ingestor
<span class="hljs-built_in">cd</span> crypto_ingestor

<span class="hljs-comment"># Add dependencies via CLI (modern Cargo)</span>
cargo add reqwest --features json
cargo add tokio --features full
cargo add rusqlite --features bundled
cargo add serde --features derive
cargo add serde_json
</code></pre>
<h2 id="heading-step-2-define-the-data-contract">Step 2: Define the Data Contract</h2>
<p>In Python, we might just grab <code>data['bitcoin']['usd']</code>. In Rust, we define the structure upfront. This is our contract. If the API violates this, the pipeline stops immediately.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> serde::Deserialize;

<span class="hljs-meta">#[derive(Debug, Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">CoinGeckoResponse</span></span> {
    bitcoin: PriceData,
}

<span class="hljs-meta">#[derive(Debug, Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">PriceData</span></span> {
    usd: <span class="hljs-built_in">f64</span>,
    usd_market_cap: <span class="hljs-built_in">f64</span>,
}
</code></pre>
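<p>To see the contract in action, here is a small sketch that feeds three hand-written JSON bodies through the same structs. The payloads and the <code>parse_body</code> helper are made up for illustration, and the example assumes both <code>serde</code> (with the <code>derive</code> feature) and <code>serde_json</code> are in your dependencies.</p>
<pre><code class="lang-rust">use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct CoinGeckoResponse {
    bitcoin: PriceData,
}

#[derive(Debug, Deserialize)]
struct PriceData {
    usd: f64,
    usd_market_cap: f64,
}

/// Hypothetical helper: parse a raw JSON body against the contract.
fn parse_body(body: &amp;str) -&gt; Result&lt;CoinGeckoResponse, serde_json::Error&gt; {
    serde_json::from_str(body)
}

fn main() {
    // Made-up payloads: one valid, one missing a field, one with a wrong type.
    let good = r#"{"bitcoin": {"usd": 100.5, "usd_market_cap": 2000.0}}"#;
    let missing_field = r#"{"bitcoin": {"usd": 100.5}}"#;
    let wrong_type = r#"{"bitcoin": {"usd": "100.5", "usd_market_cap": 2000.0}}"#;

    for body in [good, missing_field, wrong_type] {
        match parse_body(body) {
            Ok(parsed) =&gt; println!("accepted: {:?}", parsed),
            Err(e) =&gt; println!("rejected: {}", e),
        }
    }
}
</code></pre>
<p>Only the first body should be accepted. The other two are rejected at the parsing step, before any value reaches the database.</p>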
<h2 id="heading-step-3-the-gotcha-handling-real-world-apis">Step 3: The "Gotcha" (Handling Real World APIs)</h2>
<p>Public APIs often block generic HTTP clients. During development, I ran into either a <code>403 Forbidden</code> or a schema error because the API rejected requests sent without an explicit User-Agent.</p>
<p>Here is the robust, production-ready fetch function. Note how we handle errors: instead of a generic traceback, we log the raw response if parsing fails (acting like a mini Dead Letter Queue).</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> std::error::Error;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">fetch_price</span></span>() -&gt; std::result::<span class="hljs-built_in">Result</span>&lt;CoinGeckoResponse, <span class="hljs-built_in">Box</span>&lt;<span class="hljs-keyword">dyn</span> Error&gt;&gt; {
    <span class="hljs-keyword">let</span> url = <span class="hljs-string">"https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&amp;vs_currencies=usd&amp;include_market_cap=true"</span>;

    <span class="hljs-comment">// 1. Build a client so we can set headers (a custom User-Agent identifies our tool)</span>
    <span class="hljs-keyword">let</span> client = reqwest::Client::new();

    <span class="hljs-keyword">let</span> response = client
        .get(url)
        .header(<span class="hljs-string">"User-Agent"</span>, <span class="hljs-string">"MyCryptoIngestor/1.0"</span>) 
        .send()
        .<span class="hljs-keyword">await</span>?;

    <span class="hljs-comment">// 2. Check Status Code before parsing</span>
    <span class="hljs-keyword">if</span> !response.status().is_success() {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">Err</span>(<span class="hljs-built_in">Box</span>::from(<span class="hljs-built_in">format!</span>(<span class="hljs-string">"API Request failed with status: {}"</span>, response.status())));
    }

    <span class="hljs-comment">// 3. Get raw text to debug schema failures</span>
    <span class="hljs-keyword">let</span> text_content = response.text().<span class="hljs-keyword">await</span>?;

    <span class="hljs-comment">// 4. Parse JSON</span>
    <span class="hljs-keyword">let</span> parsed_data: CoinGeckoResponse = <span class="hljs-keyword">match</span> serde_json::from_str(&amp;text_content) {
        <span class="hljs-literal">Ok</span>(data) =&gt; data,
        <span class="hljs-literal">Err</span>(e) =&gt; {
            eprintln!(<span class="hljs-string">"⚠️ RAW RESPONSE RECEIVED: {}"</span>, text_content);
            <span class="hljs-keyword">return</span> <span class="hljs-literal">Err</span>(<span class="hljs-built_in">Box</span>::new(e));
        }
    };

    <span class="hljs-literal">Ok</span>(parsed_data)
}
</code></pre>
<h2 id="heading-step-4-database-operations-amp-verification">Step 4: Database Operations &amp; Verification</h2>
<p>We need to create the table if it doesn't exist, insert the data, and then—to prove it worked—read it back within the same app.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> rusqlite::{params, Connection, <span class="hljs-built_in">Result</span>};

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">init_db</span></span>() -&gt; <span class="hljs-built_in">Result</span>&lt;Connection&gt; {
    <span class="hljs-keyword">let</span> conn = Connection::open(<span class="hljs-string">"prices.db"</span>)?;
    conn.execute(
        <span class="hljs-string">"CREATE TABLE IF NOT EXISTS bitcoin_prices (
            id INTEGER PRIMARY KEY,
            price_usd REAL,
            market_cap REAL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )"</span>,
        [],
    )?;
    <span class="hljs-literal">Ok</span>(conn)
}

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">read_history</span></span>(conn: &amp;Connection) -&gt; <span class="hljs-built_in">Result</span>&lt;()&gt; {
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> stmt = conn.prepare(<span class="hljs-string">"SELECT id, price_usd, timestamp FROM bitcoin_prices ORDER BY id DESC LIMIT 5"</span>)?;

    <span class="hljs-comment">// Map rows to a tuple</span>
    <span class="hljs-keyword">let</span> rows = stmt.query_map([], |row| {
        <span class="hljs-literal">Ok</span>((
            row.get::&lt;_, <span class="hljs-built_in">i32</span>&gt;(<span class="hljs-number">0</span>)?,    <span class="hljs-comment">// id</span>
            row.get::&lt;_, <span class="hljs-built_in">f64</span>&gt;(<span class="hljs-number">1</span>)?,    <span class="hljs-comment">// price</span>
            row.get::&lt;_, <span class="hljs-built_in">String</span>&gt;(<span class="hljs-number">2</span>)?  <span class="hljs-comment">// timestamp</span>
        ))
    })?;

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"\n📊 Recent History (Last 5 records):"</span>);
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"-------------------------------------"</span>);

    <span class="hljs-keyword">for</span> row_result <span class="hljs-keyword">in</span> rows {
        <span class="hljs-keyword">let</span> (id, price, time) = row_result?;
        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"[{}] {} | Price: ${:.2}"</span>, id, time, price);
    }

    <span class="hljs-literal">Ok</span>(())
}
</code></pre>
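<p>If you want to try the database logic without touching <code>prices.db</code>, one option is an in-memory SQLite connection. The sketch below is an assumption-laden example, not part of the pipeline: <code>insert_and_count</code> is a hypothetical helper, and the prices are made-up numbers.</p>
<pre><code class="lang-rust">use rusqlite::{params, Connection, Result};

/// Hypothetical helper: ensure the table exists, insert one row,
/// and return how many rows the table now holds.
fn insert_and_count(conn: &amp;Connection, price: f64, cap: f64) -&gt; Result&lt;i64&gt; {
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bitcoin_prices (
            id INTEGER PRIMARY KEY,
            price_usd REAL,
            market_cap REAL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )",
        [],
    )?;
    conn.execute(
        "INSERT INTO bitcoin_prices (price_usd, market_cap) VALUES (?1, ?2)",
        params![price, cap],
    )?;
    conn.query_row("SELECT COUNT(*) FROM bitcoin_prices", [], |row| row.get(0))
}

fn main() -&gt; Result&lt;()&gt; {
    // The in-memory database disappears when `conn` is dropped.
    let conn = Connection::open_in_memory()?;
    let count = insert_and_count(&amp;conn, 100.5, 2000.0)?;
    println!("rows stored: {}", count);
    Ok(())
}
</code></pre>
<p>Because nothing is written to disk, you can run this repeatedly without cleaning up a file in between.</p>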
<h2 id="heading-step-5-the-full-pipeline-putting-it-together">Step 5: The Full Pipeline (Putting it together)</h2>
<p>Here is the complete <code>src/main.rs</code> file.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> rusqlite::{params, Connection, <span class="hljs-built_in">Result</span>};
<span class="hljs-keyword">use</span> serde::Deserialize;
<span class="hljs-keyword">use</span> std::error::Error;

<span class="hljs-comment">// --- Data Models ---</span>
<span class="hljs-meta">#[derive(Debug, Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">CoinGeckoResponse</span></span> {
    bitcoin: PriceData,
}

<span class="hljs-meta">#[derive(Debug, Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">PriceData</span></span> {
    usd: <span class="hljs-built_in">f64</span>,
    usd_market_cap: <span class="hljs-built_in">f64</span>,
}

<span class="hljs-comment">// --- Database Functions ---</span>
<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">init_db</span></span>() -&gt; <span class="hljs-built_in">Result</span>&lt;Connection&gt; {
    <span class="hljs-keyword">let</span> conn = Connection::open(<span class="hljs-string">"prices.db"</span>)?;
    conn.execute(
        <span class="hljs-string">"CREATE TABLE IF NOT EXISTS bitcoin_prices (
            id INTEGER PRIMARY KEY,
            price_usd REAL,
            market_cap REAL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )"</span>,
        [],
    )?;
    <span class="hljs-literal">Ok</span>(conn)
}

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">read_history</span></span>(conn: &amp;Connection) -&gt; <span class="hljs-built_in">Result</span>&lt;()&gt; {
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> stmt = conn.prepare(<span class="hljs-string">"SELECT id, price_usd, timestamp FROM bitcoin_prices ORDER BY id DESC LIMIT 5"</span>)?;
    <span class="hljs-keyword">let</span> rows = stmt.query_map([], |row| {
        <span class="hljs-literal">Ok</span>((
            row.get::&lt;_, <span class="hljs-built_in">i32</span>&gt;(<span class="hljs-number">0</span>)?,
            row.get::&lt;_, <span class="hljs-built_in">f64</span>&gt;(<span class="hljs-number">1</span>)?,
            row.get::&lt;_, <span class="hljs-built_in">String</span>&gt;(<span class="hljs-number">2</span>)?
        ))
    })?;

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"\n📊 Recent History (Last 5 records):"</span>);
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"-------------------------------------"</span>);
    <span class="hljs-keyword">for</span> row_result <span class="hljs-keyword">in</span> rows {
        <span class="hljs-keyword">let</span> (id, price, time) = row_result?;
        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"[{}] {} | Price: ${:.2}"</span>, id, time, price);
    }
    <span class="hljs-literal">Ok</span>(())
}

<span class="hljs-comment">// --- Extract Function ---</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">fetch_price</span></span>() -&gt; std::result::<span class="hljs-built_in">Result</span>&lt;CoinGeckoResponse, <span class="hljs-built_in">Box</span>&lt;<span class="hljs-keyword">dyn</span> Error&gt;&gt; {
    <span class="hljs-keyword">let</span> url = <span class="hljs-string">"https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&amp;vs_currencies=usd&amp;include_market_cap=true"</span>;
    <span class="hljs-keyword">let</span> client = reqwest::Client::new();

    <span class="hljs-keyword">let</span> response = client
        .get(url)
        .header(<span class="hljs-string">"User-Agent"</span>, <span class="hljs-string">"MyCryptoIngestor/1.0"</span>) 
        .send()
        .<span class="hljs-keyword">await</span>?;

    <span class="hljs-keyword">if</span> !response.status().is_success() {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">Err</span>(<span class="hljs-built_in">Box</span>::from(<span class="hljs-built_in">format!</span>(<span class="hljs-string">"API Request failed with status: {}"</span>, response.status())));
    }

    <span class="hljs-keyword">let</span> text_content = response.text().<span class="hljs-keyword">await</span>?;
    <span class="hljs-keyword">let</span> parsed_data: CoinGeckoResponse = <span class="hljs-keyword">match</span> serde_json::from_str(&amp;text_content) {
        <span class="hljs-literal">Ok</span>(data) =&gt; data,
        <span class="hljs-literal">Err</span>(e) =&gt; {
            eprintln!(<span class="hljs-string">"⚠️ RAW RESPONSE RECEIVED: {}"</span>, text_content);
            <span class="hljs-keyword">return</span> <span class="hljs-literal">Err</span>(<span class="hljs-built_in">Box</span>::new(e));
        }
    };
    <span class="hljs-literal">Ok</span>(parsed_data)
}

<span class="hljs-comment">// --- Main Orchestration ---</span>
<span class="hljs-meta">#[tokio::main]</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() -&gt; std::result::<span class="hljs-built_in">Result</span>&lt;(), <span class="hljs-built_in">Box</span>&lt;<span class="hljs-keyword">dyn</span> Error&gt;&gt; {
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"🚀 Starting Ingestion Pipeline..."</span>);

    <span class="hljs-comment">// 1. Initialize DB</span>
    <span class="hljs-keyword">let</span> conn = init_db()?;

    <span class="hljs-comment">// 2. Fetch Data (Extract)</span>
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"📥 Fetching data from API..."</span>);
    <span class="hljs-keyword">match</span> fetch_price().<span class="hljs-keyword">await</span> {
        <span class="hljs-literal">Ok</span>(data) =&gt; {
            <span class="hljs-keyword">let</span> price = data.bitcoin.usd;
            <span class="hljs-keyword">let</span> cap = data.bitcoin.usd_market_cap;
            <span class="hljs-built_in">println!</span>(<span class="hljs-string">"✅ Data received: BTC at ${}"</span>, price);

            <span class="hljs-comment">// 3. Insert into DB (Load)</span>
            conn.execute(
                <span class="hljs-string">"INSERT INTO bitcoin_prices (price_usd, market_cap) VALUES (?1, ?2)"</span>,
                params![price, cap],
            )?;
            <span class="hljs-built_in">println!</span>(<span class="hljs-string">"💾 Saved to SQLite database."</span>);

            <span class="hljs-comment">// 4. Verification</span>
            read_history(&amp;conn)?;
        }
        <span class="hljs-literal">Err</span>(e) =&gt; {
            eprintln!(<span class="hljs-string">"❌ Pipeline Failed: {}"</span>, e);
        }
    }

    <span class="hljs-literal">Ok</span>(())
}
</code></pre>
<h2 id="heading-running-the-pipeline">Running the Pipeline</h2>
<p>To run the application:</p>
<pre><code class="lang-bash">cargo run
</code></pre>
<p><strong>Expected Output:</strong></p>
<pre><code class="lang-plaintext">🚀 Starting Ingestion Pipeline...
📥 Fetching data from API...
✅ Data received: BTC at $90610
💾 Saved to SQLite database.

📊 Recent History (Last 5 records):
-------------------------------------
[1] 2024-01-10 12:30:01 | Price: $90610.00
</code></pre>
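<p>One detail worth knowing if you schedule this with cron or an orchestrator: in the <code>main</code> function above, a failed fetch is printed and then <code>main</code> still returns <code>Ok(())</code>, so the process exits with status 0. Below is a minimal sketch of one way to surface the failure as a non-zero exit code. It uses only the standard library; <code>run_pipeline</code>, its <code>simulate_failure</code> switch, and <code>status_for</code> are hypothetical stand-ins, not part of the pipeline code.</p>
<pre><code class="lang-rust">use std::error::Error;
use std::process::ExitCode;

/// Stand-in for the real pipeline body. `simulate_failure` is a made-up
/// switch that lets us exercise the error path without a network call.
fn run_pipeline(simulate_failure: bool) -&gt; Result&lt;(), Box&lt;dyn Error&gt;&gt; {
    if simulate_failure {
        return Err("API Request failed with status: 403 Forbidden".into());
    }
    Ok(())
}

/// Map the pipeline result to a numeric exit status (0 = success, 1 = failure).
fn status_for(result: &amp;Result&lt;(), Box&lt;dyn Error&gt;&gt;) -&gt; u8 {
    match result {
        Ok(()) =&gt; 0,
        Err(e) =&gt; {
            eprintln!("Pipeline failed: {}", e);
            1
        }
    }
}

fn main() -&gt; ExitCode {
    let result = run_pipeline(false);
    ExitCode::from(status_for(&amp;result))
}
</code></pre>
<p>With this shape, a scheduler can tell a failed run from a successful one by checking the exit status instead of parsing log output.</p>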
<h2 id="heading-conclusion">Conclusion</h2>
<p>We just built a fully functional data app in Rust. It handles network failures, parses JSON into strong types, and persists data to disk.</p>
<p>For a Data Engineer, the biggest shift is moving from "Scripting" (where you run code and hope it works) to "Systems Programming" (where you handle every possible error state upfront). It’s stricter, but the result is a pipeline that lets you sleep soundly at night.</p>
<hr />
]]></content:encoded></item><item><title><![CDATA[Escaping the Tutorial Loop: Building a CLI ETL Tool in Rust (From Scratch to Generic)]]></title><description><![CDATA[I’ve done the crash courses. I’ve read the book. But like many developers learning Rust, I kept getting stuck in the "tutorial loop". I understood the syntax, but I didn't know how to start a project.
As a Data Engineer, I decided to stop following g...]]></description><link>https://thedatacrab.com/escaping-the-tutorial-loop-building-a-cli-etl-tool-in-rust-from-scratch-to-generic</link><guid isPermaLink="true">https://thedatacrab.com/escaping-the-tutorial-loop-building-a-cli-etl-tool-in-rust-from-scratch-to-generic</guid><category><![CDATA[data-engineering]]></category><category><![CDATA[Rust]]></category><category><![CDATA[Programming Blogs]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Beginner Developers]]></category><dc:creator><![CDATA[The Data Crab]]></dc:creator><pubDate>Fri, 09 Jan 2026 10:00:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767954982274/6fb6b33e-1bd1-4557-b452-3a9797810078.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I’ve done the crash courses. I’ve read the book. But like many developers learning Rust, I kept getting stuck in the "tutorial loop". I understood the syntax, but I didn't know how to <em>start</em> a project.</p>
<p>As a Data Engineer, I decided to stop following generic tutorials and build something I actually understand: <strong>an ETL pipeline</strong>.</p>
<p>In this post, I’ll walk you through building <strong>"Rusty Pipe"</strong>, a command-line tool. We will start by building a strict, type-safe version, and then refactor it into a generic tool that can handle <em>any</em> CSV file.</p>
<h2 id="heading-step-1-the-setup">Step 1: The Setup</h2>
<p>First, we use <strong>Cargo</strong> (Rust's package manager) to create the project.</p>
<pre><code class="lang-bash">cargo new rusty_pipe
<span class="hljs-built_in">cd</span> rusty_pipe
</code></pre>
<p>We need a few "crates" (libraries). Add these to your <code>Cargo.toml</code>:</p>
<pre><code class="lang-toml">[dependencies]
clap = { version = <span class="hljs-string">"4.5"</span>, features = [<span class="hljs-string">"derive"</span>] } <span class="hljs-comment"># CLI Arguments</span>
serde = { version = <span class="hljs-string">"1.0"</span>, features = [<span class="hljs-string">"derive"</span>] } <span class="hljs-comment"># Serialization</span>
serde_json = <span class="hljs-string">"1.0"</span>
csv = <span class="hljs-string">"1.3"</span>
</code></pre>
<h2 id="heading-step-2-create-dummy-data">Step 2: Create Dummy Data</h2>
<p>Before we code, let's create the data we want to process. Create a file named <code>products.csv</code> in your project root:</p>
<pre><code class="lang-plaintext">id,name,category,price
1,Apple,Fruit,1.20
2,Laptop,Electronics,999.99
3,Banana,Fruit,0.50
4,TV,Electronics,500.00
</code></pre>
<h2 id="heading-step-3-version-1-the-strict-approach-type-safety">Step 3: Version 1 - The "Strict" Approach (Type Safety)</h2>
<p>In Python/Pandas, types are often inferred. In Rust, we usually define the shape of our data upfront using a <strong>Struct</strong>. This acts as a contract—if the CSV has bad data (like text in a price column), the program warns us immediately.</p>
<p>Here is the code for <code>src/main.rs</code>. It reads the CSV, drops cheap items (anything priced at $1.00 or less), and writes the rest to JSON.</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> clap::Parser;
<span class="hljs-keyword">use</span> serde::{Deserialize, Serialize};
<span class="hljs-keyword">use</span> std::error::Error;
<span class="hljs-keyword">use</span> std::fs;

<span class="hljs-comment">// 1. Define CLI Arguments</span>
<span class="hljs-meta">#[derive(Parser)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">Cli</span></span> {
    input: <span class="hljs-built_in">String</span>,
    output: <span class="hljs-built_in">String</span>,
}

<span class="hljs-comment">// 2. Define the Schema (The Contract)</span>
<span class="hljs-meta">#[derive(Debug, Serialize, Deserialize)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">Product</span></span> {
    id: <span class="hljs-built_in">u32</span>,
    name: <span class="hljs-built_in">String</span>,
    category: <span class="hljs-built_in">String</span>,
    price: <span class="hljs-built_in">f64</span>,
}

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() -&gt; <span class="hljs-built_in">Result</span>&lt;(), <span class="hljs-built_in">Box</span>&lt;<span class="hljs-keyword">dyn</span> Error&gt;&gt; {
    <span class="hljs-keyword">let</span> args = Cli::parse();
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> rdr = csv::Reader::from_path(args.input)?;

    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> clean_data: <span class="hljs-built_in">Vec</span>&lt;Product&gt; = <span class="hljs-built_in">Vec</span>::new();

    <span class="hljs-comment">// 3. Read rows one at a time (rows we keep are still collected into a Vec)</span>
    <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> rdr.deserialize() {
        <span class="hljs-keyword">let</span> record: Product = <span class="hljs-keyword">match</span> result {
            <span class="hljs-literal">Ok</span>(rec) =&gt; rec,
            <span class="hljs-literal">Err</span>(e) =&gt; {
                eprintln!(<span class="hljs-string">"Skipping bad row: {}"</span>, e);
                <span class="hljs-keyword">continue</span>;
            }
        };

        <span class="hljs-comment">// 4. Business Logic: Filter cheap products</span>
        <span class="hljs-keyword">if</span> record.price &gt; <span class="hljs-number">1.0</span> {
            clean_data.push(record);
        }
    }

    <span class="hljs-comment">// 5. Write to JSON</span>
    <span class="hljs-keyword">let</span> json_output = serde_json::to_string_pretty(&amp;clean_data)?;
    fs::write(args.output, json_output)?;

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Success! Processed {} records."</span>, clean_data.len());
    <span class="hljs-literal">Ok</span>(())
}
</code></pre>
<h3 id="heading-running-version-1">Running Version 1</h3>
<p>Run this in your terminal:</p>
<pre><code class="lang-bash">cargo run -- products.csv output.json
</code></pre>
<p>If you check <code>output.json</code>, you will see it correctly filtered out the "Banana" (which was $0.50):</p>
<pre><code class="lang-json">[
  {
    <span class="hljs-attr">"id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Apple"</span>,
    <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Fruit"</span>,
    <span class="hljs-attr">"price"</span>: <span class="hljs-number">1.2</span>
  },
  {
    <span class="hljs-attr">"id"</span>: <span class="hljs-number">2</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Laptop"</span>,
    <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Electronics"</span>,
    <span class="hljs-attr">"price"</span>: <span class="hljs-number">999.99</span>
  },
  {
    <span class="hljs-attr">"id"</span>: <span class="hljs-number">4</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"TV"</span>,
    <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Electronics"</span>,
    <span class="hljs-attr">"price"</span>: <span class="hljs-number">500.0</span>
  }
]
</code></pre>
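<p>To make the "contract" idea concrete without any crates, here is a simplified, hand-rolled sketch of what happens to a row with text in the price column. It is not how the <code>csv</code> crate works internally: <code>parse_product</code> is a hypothetical helper that splits on commas and does not handle quoted fields, and the last input row is a made-up bad record.</p>
<pre><code class="lang-rust">#[derive(Debug, PartialEq)]
struct Product {
    id: u32,
    name: String,
    category: String,
    price: f64,
}

/// Parse one `id,name,category,price` line. Simplified: no quoted fields.
fn parse_product(line: &amp;str) -&gt; Result&lt;Product, String&gt; {
    let fields: Vec&lt;&amp;str&gt; = line.split(',').map(str::trim).collect();
    if fields.len() != 4 {
        return Err(format!("expected 4 fields, found {}", fields.len()));
    }
    let id = fields[0]
        .parse::&lt;u32&gt;()
        .map_err(|e| format!("bad id '{}': {}", fields[0], e))?;
    let price = fields[3]
        .parse::&lt;f64&gt;()
        .map_err(|e| format!("bad price '{}': {}", fields[3], e))?;
    Ok(Product {
        id,
        name: fields[1].to_string(),
        category: fields[2].to_string(),
        price,
    })
}

fn main() {
    // The last row is a made-up bad record with text in the price column.
    let rows = [
        "1,Apple,Fruit,1.20",
        "3,Banana,Fruit,0.50",
        "5,Chair,Furniture,cheap",
    ];
    for row in rows {
        match parse_product(row) {
            Ok(p) if p.price &gt; 1.0 =&gt; println!("kept: {:?}", p),
            Ok(p) =&gt; println!("filtered out: {}", p.name),
            Err(e) =&gt; eprintln!("Skipping bad row: {}", e),
        }
    }
}
</code></pre>
<p>The three outcomes mirror the real program: a valid row above the threshold is kept, a valid cheap row is filtered out, and a row that breaks the contract is reported and skipped.</p>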
<h2 id="heading-step-4-the-refactor-making-it-generic">Step 4: The Refactor - Making it Generic</h2>
<p>The strict version is great for production pipelines where the schema is known. But what if I want to use this tool on <em>any</em> CSV file, regardless of columns?</p>
<p>We can swap the <code>Struct</code> for a <code>HashMap</code>. This is closer to how Python works: dynamic and flexible.</p>
<p><strong>Changes required:</strong></p>
<ol>
<li><p>Remove <code>struct Product</code>.</p>
</li>
<li><p>Import <code>std::collections::HashMap</code>.</p>
</li>
<li><p>Change <code>Vec&lt;Product&gt;</code> to <code>Vec&lt;HashMap&lt;String, String&gt;&gt;</code>.</p>
</li>
</ol>
<p>Here is the updated <code>src/main.rs</code>:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> clap::Parser;
<span class="hljs-keyword">use</span> std::error::Error;
<span class="hljs-keyword">use</span> std::fs;
<span class="hljs-keyword">use</span> std::collections::HashMap; <span class="hljs-comment">// Import HashMap</span>

<span class="hljs-meta">#[derive(Parser)]</span>
<span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">Cli</span></span> {
    input: <span class="hljs-built_in">String</span>,
    output: <span class="hljs-built_in">String</span>,
}

<span class="hljs-comment">// <span class="hljs-doctag">NOTE:</span> We removed the Product struct!</span>

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() -&gt; <span class="hljs-built_in">Result</span>&lt;(), <span class="hljs-built_in">Box</span>&lt;<span class="hljs-keyword">dyn</span> Error&gt;&gt; {
    <span class="hljs-keyword">let</span> args = Cli::parse();
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> rdr = csv::Reader::from_path(args.input)?;

    <span class="hljs-comment">// Change the Vector to store Maps (Key=String, Value=String)</span>
    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> clean_data: <span class="hljs-built_in">Vec</span>&lt;HashMap&lt;<span class="hljs-built_in">String</span>, <span class="hljs-built_in">String</span>&gt;&gt; = <span class="hljs-built_in">Vec</span>::new();

    <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> rdr.deserialize() {
        <span class="hljs-comment">// The csv crate (via Serde) maps each header to a key and each cell to a value</span>
        <span class="hljs-keyword">let</span> record: HashMap&lt;<span class="hljs-built_in">String</span>, <span class="hljs-built_in">String</span>&gt; = <span class="hljs-keyword">match</span> result {
            <span class="hljs-literal">Ok</span>(rec) =&gt; rec,
            <span class="hljs-literal">Err</span>(e) =&gt; {
                eprintln!(<span class="hljs-string">"Skipping bad row: {}"</span>, e);
                <span class="hljs-keyword">continue</span>;
            }
        };

        <span class="hljs-comment">// We removed the price filtering logic because we </span>
        <span class="hljs-comment">// don't know if a "price" column exists in a generic file!</span>
        clean_data.push(record);
    }

    <span class="hljs-keyword">let</span> json_output = serde_json::to_string_pretty(&amp;clean_data)?;
    fs::write(args.output, json_output)?;

    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Success! Converted {} records."</span>, clean_data.len());
    <span class="hljs-literal">Ok</span>(())
}
</code></pre>
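<p>Now you can run this same tool on <code>products.csv</code>, <code>users.csv</code>, or <code>logs.csv</code>. It has become a general-purpose CSV-to-JSON converter! One caveat: because every value is read as a <code>String</code>, numbers such as prices appear in the JSON as quoted strings.</p>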
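<p>If you still want the price filter in the generic version, one option is to apply it only when the column is actually there. The sketch below is an illustration under assumptions rather than part of the tool: <code>keep_row</code> and <code>min_price</code> are hypothetical names, and keeping rows that have no parseable <code>price</code> is just one reasonable policy. It uses only the standard library, so you can try it on its own.</p>
<pre><code class="lang-rust">use std::collections::HashMap;

// Hypothetical helper: decides whether a generic row survives the filter.
// Rows with no "price" column, or whose price fails to parse as f64, are kept.
fn keep_row(row: &amp;HashMap&lt;String, String&gt;, min_price: f64) -&gt; bool {
    match row.get("price").and_then(|p| p.trim().parse::&lt;f64&gt;().ok()) {
        Some(price) =&gt; price &gt; min_price,
        None =&gt; true, // no usable price: nothing to filter on
    }
}

fn main() {
    // A row from a products-style file with a low price.
    let mut cheap = HashMap::new();
    cheap.insert("name".to_string(), "Gum".to_string());
    cheap.insert("price".to_string(), "0.50".to_string());

    // A row from a file that has no "price" column at all.
    let mut user = HashMap::new();
    user.insert("username".to_string(), "ferris".to_string());

    println!("cheap kept: {}", keep_row(&amp;cheap, 1.0)); // prints "cheap kept: false"
    println!("user kept: {}", keep_row(&amp;user, 1.0)); // prints "user kept: true"
}
</code></pre>
<p>Inside the loop, you could then guard the push with <code>if keep_row(&amp;record, 1.0) { clean_data.push(record); }</code>.</p>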
<h2 id="heading-step-5-the-grand-finale-install-it-globally">Step 5: The Grand Finale - Install It Globally</h2>
<p>Right now, we are running our tool using <code>cargo run</code>. That’s fine for development, but in production, we want a standalone tool that we can run from anywhere—just like <code>grep</code>, <code>jq</code>, or <code>python</code>.</p>
<h3 id="heading-1-build-for-release">1. Build for Release</h3>
<p>Cargo compiles in "debug" mode by default (which is fast to compile but slow to run). Let's build a highly optimized <strong>release binary</strong>.</p>
<pre><code class="lang-bash">cargo build --release
</code></pre>
<p>This creates a standalone executable at <code>./target/release/rusty_pipe</code> (<code>rusty_pipe.exe</code> on Windows). You can send this file to a friend on the same OS and CPU architecture, and it will run without Rust installed.</p>
<h3 id="heading-2-install-to-your-system-path">2. Install to your System Path</h3>
<p>To make it available globally in your terminal, copy it to a directory that is on your <code>PATH</code>.</p>
<p><strong>For Mac/Linux:</strong></p>
<pre><code class="lang-bash">sudo cp ./target/release/rusty_pipe /usr/<span class="hljs-built_in">local</span>/bin/
</code></pre>
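<p><strong>For Windows:</strong> Copy <code>target\release\rusty_pipe.exe</code> to any folder that is in your system <code>PATH</code>. On any platform, <code>cargo install --path .</code> is an alternative: it builds in release mode and places the binary in Cargo's bin directory (<code>~/.cargo/bin</code> by default).</p>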
<h3 id="heading-3-run-it-like-a-pro">3. Run it like a Pro</h3>
<p>Close your terminal, open a new one, navigate to your Desktop (or any folder with a CSV), and run:</p>
<pre><code class="lang-bash">rusty_pipe input.csv output.json
</code></pre>
<p>Congratulations! You just built and installed your own system-level CLI tool.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>We built two versions of an ETL tool:</p>
<ol>
<li><p><strong>The Strict Version:</strong> Uses <code>structs</code>. Best for known data, ensures type safety, and allows easy filtering (e.g., <code>price &gt; 1.0</code>).</p>
</li>
<li><p><strong>The Generic Version:</strong> Uses <code>HashMaps</code>. Best for general-purpose utilities where the column names aren't known ahead of time.</p>
</li>
</ol>
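<p>The two versions can also meet in the middle. As a rough sketch of the trade-off, the snippet below promotes a generic row to a strict record and reports exactly which column is the problem when it cannot. The <code>Product</code> fields (<code>name</code>, <code>price</code>) and the <code>to_product</code> helper are assumptions made for illustration, not code from the tool above.</p>
<pre><code class="lang-rust">use std::collections::HashMap;

// Hypothetical strict record; the field names are assumed for illustration.
#[derive(Debug, PartialEq)]
struct Product {
    name: String,
    price: f64,
}

// Try to promote a generic row to the strict type, describing what is wrong if it fails.
fn to_product(row: &amp;HashMap&lt;String, String&gt;) -&gt; Result&lt;Product, String&gt; {
    let name = row.get("name").ok_or("missing column: name")?.clone();
    let raw = row.get("price").ok_or("missing column: price")?;
    let price = raw
        .parse::&lt;f64&gt;()
        .map_err(|e| format!("bad price {:?}: {}", raw, e))?;
    Ok(Product { name, price })
}

fn main() {
    let mut row = HashMap::new();
    row.insert("name".to_string(), "Coffee".to_string());
    row.insert("price".to_string(), "n/a".to_string());

    // The bad price is caught at the conversion boundary, not later in the pipeline.
    match to_product(&amp;row) {
        Ok(p) =&gt; println!("{} costs {:.2}", p.name, p.price),
        Err(e) =&gt; eprintln!("rejected row: {}", e),
    }
}
</code></pre>
<p>With a struct, Serde performs this kind of check for you during deserialization; with a <code>HashMap</code>, any validation you want has to be written by hand like this.</p>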
<p>This project covers the core concepts of Rust for Data Engineering: <strong>Cargo, Structs, Iterators, and Serde</strong>.</p>
<hr />
]]></content:encoded></item></channel></rss>