Building a High-Performance Company Name Standardizer with Rust, React, and the SEC API
Stop hardcoding mappings: A dynamic approach to financial data standardization.

Data Engineers know the pain: you receive a dataset where one row says "JPMC", another says "J.P. Morgan", and a third says "JPMorgan Chase & Co.". To perform any meaningful aggregation, you need to normalize these into a single "Golden Record."
While Python (scikit-learn / fuzzywuzzy) is the go-to for this, it can be slow at scale and requires heavy dependencies.
In this post, I’ll show you how to build a blazing fast, production-grade Standardization Engine using Rust for the backend and React for the UI. We will go beyond simple regex and implement:
Real-time Master Data: Fetching live tickers from the SEC (US Securities and Exchange Commission).
Fuzzy Logic: Using the Jaro-Winkler algorithm to handle typos ("Amzn" → "Amazon").
Modern UI: A polished React frontend with Dark Mode and CSV Export.
🏗️ The Architecture
Backend: Rust (
Actix-web)Speed: Handles thousands of requests per second.
Logic:
strsimcrate for string distance calculations.Data: Fetches
company_tickers.jsonfrom SEC.gov on startup.
Frontend: React (
Vite)- UX: Real-time lookup, Batch CSV processing, Dark/Light mode toggle.
🚀 Part 1: The Rust Backend
We use Actix-web for the server and strsim for the fuzzy matching logic.
1. Initialize the Project
mkdir company-standardizer
cd company-standardizer
cargo new backend
cd backend
2. Dependencies (Cargo.toml)
We need reqwest to fetch the SEC data and strsim for the math.
[package]
name = "backend"
version = "0.1.0"
edition = "2021"
[dependencies]
actix-web = "4"
actix-cors = "0.6"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
strsim = "0.11" # The magic fuzzy matching crate
reqwest = { version = "0.11", features = ["json", "blocking"] }
tokio = { version = "1", features = ["full"] }
3. The Logic (src/main.rs)
This is the core engine. It employs a "Waterfall" matching strategy:
Ticker Map: O(1) Lookup (e.g., "MSFT" -> "Microsoft").
Jaro-Winkler: Measures string similarity. If the score > 0.75, we consider it a match.
use actix_cors::Cors;
use actix_web::{web, App, HttpResponse, HttpServer, Responder};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::Mutex;
use strsim::jaro_winkler;
// --- Data Structures ---
#[derive(Clone, Serialize, Deserialize, Debug)]
struct CompanyReference {
title: String,
ticker: String,
}
#[derive(Deserialize, Debug)]
struct SecCompany {
ticker: String,
title: String,
}
#[derive(Serialize, Debug)]
struct MatchResult {
input: String,
standardized_name: Option<String>,
ticker: Option<String>,
score: f64,
method: String,
}
#[derive(Deserialize)]
struct LookupRequest { name: String }
#[derive(Deserialize)]
struct BatchRequest { names: Vec<String> }
// --- The Engine ---
struct StandardizationEngine {
master_list: Vec<CompanyReference>,
ticker_map: HashMap<String, String>,
}
impl StandardizationEngine {
fn new(companies: Vec<CompanyReference>) -> Self {
let mut ticker_map = HashMap::new();
for company in &companies {
ticker_map.insert(company.ticker.clone(), company.title.clone());
}
StandardizationEngine { master_list: companies, ticker_map }
}
fn search(&self, query: &str) -> MatchResult {
let clean_query = query.trim();
let query_upper = clean_query.to_uppercase();
// 1. Ticker Lookup (Exact & Fast)
if let Some(name) = self.ticker_map.get(&query_upper) {
return MatchResult {
input: query.to_string(),
standardized_name: Some(name.clone()),
ticker: Some(query_upper),
score: 1.0,
method: "Ticker Map".to_string(),
};
}
// 2. Fuzzy Search (Jaro-Winkler)
let mut best_match: Option<CompanyReference> = None;
let mut best_score = 0.0;
for company in &self.master_list {
let score = jaro_winkler(&query_upper, &company.title.to_uppercase());
if score > best_score {
best_score = score;
best_match = Some(company.clone());
}
}
let method = if best_score == 1.0 { "Exact Match" } else { "Jaro-Winkler Fuzzy" };
// Threshold > 0.75 usually indicates a good match for company names
if best_score > 0.75 {
let m = best_match.unwrap();
MatchResult {
input: query.to_string(),
standardized_name: Some(m.title),
ticker: Some(m.ticker),
score: best_score,
method: method.to_string(),
}
} else {
MatchResult {
input: query.to_string(),
standardized_name: None,
ticker: None,
score: best_score,
method: "No Match".to_string(),
}
}
}
}
struct AppState {
engine: Mutex<StandardizationEngine>,
}
// --- API Handlers ---
async fn standardize_one(data: web::Data<AppState>, req: web::Json<LookupRequest>) -> impl Responder {
let engine = data.engine.lock().unwrap();
let result = engine.search(&req.name);
HttpResponse::Ok().json(result)
}
async fn standardize_batch(data: web::Data<AppState>, req: web::Json<BatchRequest>) -> impl Responder {
let engine = data.engine.lock().unwrap();
let results: Vec<MatchResult> = req.names.iter().map(|name| engine.search(name)).collect();
HttpResponse::Ok().json(results)
}
// --- Data Loader ---
async fn fetch_sec_data() -> Vec<CompanyReference> {
println!("Fetching live data from SEC.gov...");
let client = reqwest::Client::new();
// SEC requires a user-agent
let resp = client.get("https://www.sec.gov/files/company_tickers.json")
.header("User-Agent", "TestApp (test@example.com)")
.send().await;
if let Ok(response) = resp {
if let Ok(map) = response.json::<HashMap<String, SecCompany>>().await {
let mut companies = Vec::new();
for (_, val) in map {
companies.push(CompanyReference { title: val.title, ticker: val.ticker });
}
println!("Loaded {} companies.", companies.len());
return companies;
}
}
// Fallback data if API fails
vec![CompanyReference { title: "Apple Inc.".into(), ticker: "AAPL".into() }]
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
let companies = fetch_sec_data().await;
let engine = StandardizationEngine::new(companies);
let app_state = web::Data::new(AppState { engine: Mutex::new(engine) });
println!("Server running at http://127.0.0.1:8080");
HttpServer::new(move || {
App::new()
.wrap(Cors::permissive())
.app_data(app_state.clone())
.route("/api/standardize", web::post().to(standardize_one))
.route("/api/batch-standardize", web::post().to(standardize_batch))
})
.bind(("127.0.0.1", 8080))?
.run()
.await
}
🎨 Part 2: The React Frontend
We use Vite for a fast setup. The frontend features a beautiful card-based layout, a Dark Mode toggle, and CSV export functionality.
1. Setup
cd .. # go to the root directory
npm create vite@latest frontend -- --template react
cd frontend
npm install axios
2. Styling (App.css)
This CSS uses variables for instant theme switching.
:root {
/* Light Theme Variables */
--bg-color: #f8f9fa;
--card-bg: #ffffff;
--text-main: #2c3e50;
--text-secondary: #5f6368;
--accent-color: #2563eb;
--accent-hover: #1d4ed8;
--border-color: #e2e8f0;
--input-bg: #ffffff;
--table-header: #f1f5f9;
--table-hover: #f8fafc;
--shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
}
[data-theme='dark'] {
/* Dark Theme Variables */
--bg-color: #0f172a;
--card-bg: #1e293b;
--text-main: #f1f5f9;
--text-secondary: #94a3b8;
--accent-color: #3b82f6;
--accent-hover: #60a5fa;
--border-color: #334155;
--input-bg: #0f172a;
--table-header: #334155;
--table-hover: #1e293b; /* Slightly lighter than card bg for hover effect if needed */
--shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.5);
}
body {
margin: 0;
background-color: var(--bg-color);
color: var(--text-main);
font-family: 'Inter', system-ui, -apple-system, sans-serif;
transition: background-color 0.3s, color 0.3s;
min-height: 100vh;
display: flex;
justify-content: center;
}
#root {
width: 100%;
max-width: 1000px;
padding: 2rem;
text-align: center;
}
/* Header & Toggle */
header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 2rem;
padding-bottom: 1rem;
border-bottom: 1px solid var(--border-color);
}
h1 {
font-size: 2rem;
margin: 0;
background: linear-gradient(90deg, var(--accent-color), #8b5cf6);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
}
.theme-toggle {
background: none;
border: 1px solid var(--border-color);
color: var(--text-main);
font-size: 1.2rem;
padding: 0.5rem;
border-radius: 50%;
cursor: pointer;
transition: all 0.2s;
width: 40px;
height: 40px;
display: flex;
align-items: center;
justify-content: center;
}
.theme-toggle:hover {
background-color: var(--border-color);
}
/* Cards */
.card {
background-color: var(--card-bg);
padding: 2rem;
border-radius: 12px;
box-shadow: var(--shadow);
border: 1px solid var(--border-color);
margin-bottom: 2rem;
transition: background-color 0.3s;
}
h2 {
margin-top: 0;
color: var(--text-main);
font-size: 1.25rem;
margin-bottom: 1.5rem;
text-align: left;
}
/* Inputs & Buttons */
.search-box {
display: flex;
gap: 12px;
}
input[type="text"] {
flex: 1;
padding: 12px 16px;
border-radius: 8px;
border: 1px solid var(--border-color);
background-color: var(--input-bg);
color: var(--text-main);
font-size: 1rem;
outline: none;
transition: border-color 0.2s;
}
input[type="text"]:focus {
border-color: var(--accent-color);
box-shadow: 0 0 0 2px rgba(37, 99, 235, 0.2);
}
button.primary-btn {
background-color: var(--accent-color);
color: white;
border: none;
padding: 12px 24px;
border-radius: 8px;
font-weight: 600;
cursor: pointer;
transition: background-color 0.2s;
}
button.primary-btn:hover {
background-color: var(--accent-hover);
}
button:disabled {
opacity: 0.7;
cursor: not-allowed;
}
/* File Upload */
.file-upload-label {
display: inline-block;
padding: 12px 24px;
border: 2px dashed var(--border-color);
border-radius: 8px;
cursor: pointer;
color: var(--text-secondary);
font-weight: 500;
width: 100%;
box-sizing: border-box;
transition: all 0.2s;
}
.file-upload-label:hover {
border-color: var(--accent-color);
color: var(--accent-color);
background-color: rgba(37, 99, 235, 0.05);
}
/* Results Grid (Single) */
.result-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 1.5rem;
margin-top: 1.5rem;
text-align: left;
padding-top: 1.5rem;
border-top: 1px solid var(--border-color);
}
.result-item label {
display: block;
font-size: 0.85rem;
color: var(--text-secondary);
margin-bottom: 0.25rem;
text-transform: uppercase;
letter-spacing: 0.05em;
}
.result-item .value {
font-size: 1.1rem;
font-weight: 600;
color: var(--text-main);
}
.ticker-badge {
background-color: var(--border-color);
color: var(--text-main);
padding: 2px 6px;
border-radius: 4px;
font-size: 0.8rem;
margin-left: 8px;
vertical-align: middle;
}
/* Table */
.table-container {
overflow-x: auto;
border-radius: 8px;
border: 1px solid var(--border-color);
}
table {
width: 100%;
border-collapse: collapse;
}
th {
background-color: var(--table-header);
color: var(--text-secondary);
font-weight: 600;
padding: 12px 16px;
text-align: left;
font-size: 0.9rem;
}
td {
padding: 12px 16px;
border-top: 1px solid var(--border-color);
color: var(--text-main);
text-align: left;
}
tr:hover td {
background-color: var(--table-hover); /* Fixed hover visibility */
}
/* Utilities */
.status-badge {
padding: 4px 8px;
border-radius: 4px;
font-size: 0.85rem;
font-weight: 600;
}
3. The Logic (App.jsx)
Features:
Batch Processing: Upload a
.txtfile, get a table of results.CSV Download: Automatically generates a report of the standardization.
Theme Toggle: Persists your choice in
localStorage.
import { useState, useEffect } from 'react'
import axios from 'axios'
import './App.css'
function App() {
const [inputText, setInputText] = useState('')
const [result, setResult] = useState(null)
const [batchResults, setBatchResults] = useState([])
const [loading, setLoading] = useState(false)
const [theme, setTheme] = useState('light')
// Theme Init
useEffect(() => {
const savedTheme = localStorage.getItem('theme')
const prefersDark = window.matchMedia('(prefers-color-scheme: dark)').matches
if (savedTheme) {
setTheme(savedTheme)
document.documentElement.setAttribute('data-theme', savedTheme)
} else if (prefersDark) {
setTheme('dark')
document.documentElement.setAttribute('data-theme', 'dark')
}
}, [])
const toggleTheme = () => {
const newTheme = theme === 'light' ? 'dark' : 'light'
setTheme(newTheme)
document.documentElement.setAttribute('data-theme', newTheme)
localStorage.setItem('theme', newTheme)
}
// --- API Handlers ---
const handleCheck = async () => {
if (!inputText) return
setLoading(true)
try {
const res = await axios.post('http://127.0.0.1:8080/api/standardize', { name: inputText })
setResult(res.data)
} catch (err) {
console.error(err)
alert("Error connecting to backend")
}
setLoading(false)
}
const handleFileUpload = (e) => {
const file = e.target.files[0]
if (!file) return
const reader = new FileReader()
reader.onload = async (event) => {
const text = event.target.result
const names = text.split('\n').map(n => n.trim()).filter(n => n)
setLoading(true)
try {
const res = await axios.post('http://127.0.0.1:8080/api/batch-standardize', { names })
setBatchResults(res.data)
} catch (err) {
console.error(err)
alert("Error processing batch")
}
setLoading(false)
}
reader.readAsText(file)
}
// --- CSV Download Logic ---
const downloadCSV = () => {
if (batchResults.length === 0) return
// 1. Define Headers
const headers = ["Input Name", "Standardized Name", "Ticker", "Method", "Confidence Score"]
// 2. Format Data Rows
const csvRows = batchResults.map(item => {
// Escape quotes in data to prevent CSV breakage
const safeInput = `"${item.input.replace(/"/g, '""')}"`
const safeName = `"${(item.standardized_name || "").replace(/"/g, '""')}"`
const safeTicker = `"${(item.ticker || "").replace(/"/g, '""')}"`
const safeMethod = `"${item.method}"`
const safeScore = `"${(item.score * 100).toFixed(1)}%"`
return [safeInput, safeName, safeTicker, safeMethod, safeScore].join(",")
})
// 3. Combine and Trigger Download
const csvContent = [headers.join(","), ...csvRows].join("\n")
const blob = new Blob([csvContent], { type: 'text/csv;charset=utf-8;' })
const url = URL.createObjectURL(blob)
const link = document.createElement("a")
link.href = url
link.setAttribute("download", "standardized_companies_report.csv")
document.body.appendChild(link)
link.click()
document.body.removeChild(link)
}
// --- Render Helpers ---
const getScoreColor = (score) => {
if (score === 1.0) return '#22c55e'
if (score > 0.85) return '#3b82f6'
if (score > 0.75) return '#f59e0b'
return '#ef4444'
}
return (
<div className="app-container">
<header>
<div>
<h1>Company Standardizer</h1>
<span style={{ color: 'var(--text-secondary)', fontSize: '0.9rem' }}>
Powered by Rust & SEC API
</span>
</div>
<button className="theme-toggle" onClick={toggleTheme} title="Toggle Theme">
{theme === 'light' ? '🌙' : '☀️'}
</button>
</header>
{/* Single Lookup Card */}
<section className="card">
<h2>Single Entity Lookup</h2>
<div className="search-box">
<input
type="text"
value={inputText}
onChange={(e) => setInputText(e.target.value)}
placeholder="Enter Company Name or Ticker (e.g., Apple, NVDA)"
onKeyDown={(e) => e.key === 'Enter' && handleCheck()}
/>
<button className="primary-btn" onClick={handleCheck} disabled={loading}>
{loading ? 'Searching...' : 'Search'}
</button>
</div>
{result && (
<div className="result-grid">
<div className="result-item">
<label>Input</label>
<div className="value">{result.input}</div>
</div>
<div className="result-item">
<label>Standardized Name</label>
<div className="value" style={{ display: 'flex', flexDirection: 'column', alignItems: 'flex-start', gap: '5px' }}>
{result.standardized_name ? (
<>
<span>{result.standardized_name}</span>
{result.ticker && (
<span className="ticker-badge" title="Stock Ticker">
{result.ticker}
</span>
)}
</>
) : (
<span style={{color: 'var(--text-secondary)'}}>No Match</span>
)}
</div>
</div>
<div className="result-item">
<label>Method</label>
<div className="value" style={{ fontSize: '0.95rem' }}>{result.method}</div>
</div>
<div className="result-item">
<label>Confidence</label>
<div className="value" style={{ color: getScoreColor(result.score) }}>
{(result.score * 100).toFixed(1)}%
</div>
</div>
</div>
)}
</section>
{/* Batch Processing Card */}
<section className="card">
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '1.5rem' }}>
<h2 style={{ marginBottom: 0 }}>Batch Processing</h2>
{/* Download Button - Only shows when there are results */}
{batchResults.length > 0 && (
<button
onClick={downloadCSV}
style={{
backgroundColor: '#22c55e', // Green for Excel/CSV
color: 'white',
border: 'none',
padding: '8px 16px',
borderRadius: '6px',
cursor: 'pointer',
fontSize: '0.9rem',
fontWeight: '600',
boxShadow: '0 2px 4px rgba(0,0,0,0.1)'
}}
>
⬇ Download CSV
</button>
)}
</div>
<div style={{ marginBottom: '1.5rem' }}>
<label className="file-upload-label">
<input
type="file"
onChange={handleFileUpload}
accept=".txt,.csv"
style={{ display: 'none' }}
/>
{loading ? "Processing..." : "Click to Upload .txt or .csv List"}
</label>
</div>
{loading && batchResults.length === 0 && <p style={{color: 'var(--text-secondary)'}}>Processing batch file...</p>}
{batchResults.length > 0 && (
<div className="table-container">
<table>
<thead>
<tr>
<th>Input</th>
<th>Standardized Name</th>
<th>Ticker</th>
<th>Method</th>
<th>Score</th>
</tr>
</thead>
<tbody>
{batchResults.map((item, idx) => (
<tr key={idx}>
<td>{item.input}</td>
<td>{item.standardized_name || "-"}</td>
<td>
{item.ticker ? <span className="ticker-badge">{item.ticker}</span> : <span style={{color:'var(--text-secondary)'}}>-</span>}
</td>
<td>{item.method}</td>
<td>
<span
className="status-badge"
style={{
backgroundColor: `${getScoreColor(item.score)}20`,
color: getScoreColor(item.score)
}}
>
{(item.score * 100).toFixed(0)}%
</span>
</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</section>
</div>
)
}
export default App
⚡ How to Run
Follow these exact steps to run the application locally.
1. Start the Backend
Open a terminal in the backend/ directory:
# This compiles the Rust code and starts the server
cargo run
Wait until you see: Successfully loaded X companies... Server running at http://127.0.0.1:8080.
2. Start the Frontend
Open a new terminal window in the frontend/ directory:
# Install dependencies (only needed the first time)
npm install
# Start the dev server
npm run dev
Click the URL displayed (usually http://localhost:5173).
3. Test It Out
Exact Match: Type
MSFT. The system detects the ticker and returns Microsoft Corporation.Fuzzy Match: Type
Amzn. The system uses Jaro-Winkler logic to match it to Amazon.com Inc. with high confidence.Batch: Create a text file named
test.txtwith the content:Appl Teslaa JPMvcUpload it, view the table, and click Download CSV to get your report.

Conclusion
By combining the performance of Rust with the interactivity of React, we've built a tool that solves a real Data Engineering problem. This architecture is easily extensible, you could swap the SEC API for an internal SQL database or add machine learning models for even smarter matching.

