<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>rishit chaturvedi · writing</title>
    <link>https://rishit.best/writing</link>
    <atom:link href="https://rishit.best/rss.xml" rel="self" type="application/rss+xml" />
    <description>occasional essays on hiring, ai agents, and founders.</description>
    <language>en-us</language>
    <lastBuildDate>Fri, 24 Apr 2026 15:53:07 GMT</lastBuildDate>
    
    <item>
      <title>you don&apos;t actually want what you say you want in a hire</title>
      <link>https://rishit.best/writing/you-dont-want-what-you-say</link>
      <guid isPermaLink="true">https://rishit.best/writing/you-dont-want-what-you-say</guid>
      <pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate>
      <description>every hiring manager describes two candidates. the one they say they want, and the one they actually hire. these are rarely the same person.</description>
      <content:encoded><![CDATA[
every hiring manager i've ever talked to has told me two versions of what they want in a candidate.

the first version lives in the job description, the intake call, the "must-haves" spreadsheet. it uses words like "scrappy," "high-agency," "first-principles thinker." it tells a story about the team's values. it signals what the team wants to *believe* about itself.

the second version is the one that actually gets hired. you see it in the debrief notes, in the offer letter, in who the hiring manager keeps talking about a week later. it's often a very different person.

one client, last quarter, told us with a straight face that they wanted "scrappy startup energy." their last three hires had all come from google. another insisted on "mission-driven, not resume-driven" candidates, then rejected a founder who pivoted out of a failed startup because the founder "hadn't shown consistent growth."

this isn't lying. nobody's being dishonest. it's a gap between declared preferences and revealed preferences, and it shows up everywhere humans evaluate humans.

## why the gap exists

three things conspire to open it.

**memory decay.** a one-hour interview is a lot of information. by the time the debrief happens three days later, the interviewer remembers about four things: two moments they liked, one moment they didn't, and the overall vibe. everything else rounds to zero. the things that survive are rarely the things they said they cared about. they're the things that were most *memorable*, which is a different filter entirely.

**halo effects.** a candidate who is articulate in the first ten minutes gets graded more generously on every dimension for the rest of the loop. a candidate who stumbles early has to climb out of a hole the interviewer doesn't know they dug. "communication" wasn't on the rubric. it doesn't need to be. it warps the rubric anyway.

**social debrief pressure.** the person who speaks first sets the frame. the second person calibrates to them. by the time the fifth interviewer weighs in, the decision is half-made, and they either agree or burn social capital dissenting. debriefs look like evidence. they're mostly vibes, laundered through a conference room.

## what we see in the data

here's the pattern i keep seeing in the wild. the single most predictive variable of whether a candidate gets hired is not the rubric score. it's whether the hiring manager *spoke first* in the debrief and whether they came in hot.

"came in hot" is not a technical term. but you know it when you see it. the hiring manager walks in, and before anyone else says anything, says "yeah i really liked them." everyone else exhales and agrees.

the opposite also happens. "i don't know, i had some concerns." exhale. agree.

the rubric exists. the rubric is followed. the rubric doesn't decide.

## the fix isn't more rubric

the usual response to this is: more structure. tighter rubrics. calibration sessions. scorecards. and all of those help, a little. but they fight the wrong battle. the rubric isn't the problem. the *gap* between the rubric and the decision is the problem.

what you actually want is a system that can see both layers at once. the declared preferences (what you said you wanted). the revealed preferences (who you actually chose, and why). and then surface the delta, so the team can decide whether the delta is a bug or a feature.
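to make "surface the delta" concrete, here's a toy sketch. none of this is mazle's actual code; the trait names and the 1-minus-rate scoring are made up for illustration. the idea is just: take the traits the rubric declares, measure how often they show up in who actually got hired, and report the gap:

```typescript
// hypothetical sketch of "surfacing the delta": compare how often a
// declared trait appears among the people who actually got hired.
// trait names and scoring are illustrative, not mazle's real model.

interface Candidate { traits: string[]; hired: boolean }

// fraction of hired candidates who have the given trait
function traitRate(cands: Candidate[], trait: string): number {
  const hired = cands.filter((c) => c.hired);
  if (hired.length === 0) return 0;
  return hired.filter((c) => c.traits.includes(trait)).length / hired.length;
}

// declared: traits the rubric says matter. revealed: rate among hires.
// delta of 1 means "declared it, never hired it"; 0 means they match.
function preferenceDelta(
  declared: string[],
  cands: Candidate[],
): Record<string, number> {
  const delta: Record<string, number> = {};
  for (const t of declared) {
    delta[t] = 1 - traitRate(cands, t);
  }
  return delta;
}

const cands: Candidate[] = [
  { traits: ["big-co", "articulate"], hired: true },
  { traits: ["scrappy", "founder"], hired: false },
  { traits: ["big-co"], hired: true },
];

console.log(preferenceDelta(["scrappy"], cands)); // scrappy: declared, never hired
```

a real system would weight this by interview scores and sample size, but even the toy version makes the "scrappy startup energy" vs. three-googlers gap show up as a number instead of an anecdote.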

sometimes the delta is a feature. maybe your declared rubric was wrong and your gut was right. maybe "scrappy startup energy" was never the real hiring bar, and hiring senior googlers is the correct move. great, update the rubric.

sometimes the delta is a bug. you hired the person who reminded the ceo of himself. that's not a signal. that's a mirror.

you can't fix what you can't see.

## what mazle does about it

we auto-capture every interview. structure the signal in the moment. pull the actual debrief conversation, not just the scorecard. and show teams, over time, the gap between what they say they reward and what they actually reward.

it's uncomfortable. we've had hiring managers stare at three months of their own data and go very quiet.

that's the point.

the gap between what you say and what you do isn't bad. it's just invisible. that's the problem.
]]></content:encoded>
    </item>
    <item>
      <title>i built a 30,000-line product in a month. here&apos;s what that actually looked like.</title>
      <link>https://rishit.best/writing/built-30k-lines-in-a-month</link>
      <guid isPermaLink="true">https://rishit.best/writing/built-30k-lines-in-a-month</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <description>not a victory lap. not a vibe-coding essay. a diary of four weeks of building mazle with claude code, and where the real leverage actually showed up.</description>
      <content:encoded><![CDATA[
this isn't a victory lap. it isn't another "ai replaced my engineers" essay. it's a diary.

four weeks ago i had a spec, a domain, and a pretty clear idea of what mazle needed to do. i had no engineering team yet. i opened a terminal, started claude code, and just started building.

by week four there were about 30,000 lines of code in the repo and a working product on the other side of the screen. that's the headline. the interesting stuff is what happened in between.

## week 1: the setup

the first week was not about shipping features. it was about setting up a loop i could iterate in.

i picked a boring stack. next.js 14, postgres, prisma, tailwind, vercel. nothing exotic. the goal was to pick the stack that the model had seen the most of, so it would make the fewest weird suggestions.

the main decision was about how i worked, not what i built. i settled on three rules pretty quickly:

1. write a detailed spec before every meaningful feature. not a ticket. a *document*. what it does, what the edge cases are, what the data model looks like, what it explicitly will not do.

2. never accept a diff i don't understand. if claude writes something i can't read line-by-line, i reject it and rewrite the spec until the output is readable.

3. commit every 30 minutes. because i was going to want to revert, and i was going to regret it if i couldn't.

week one was slow. i shipped maybe four small features. but the loop was there.

## week 2: the multiplier moment

the thing no one tells you about building with an ai coding agent is that the velocity is not linear. it's discontinuous.

for about ten days, it felt like i was building at roughly the speed of a competent solo engineer who already knew the codebase. fast, but not magic. i'd type a spec, claude would produce a diff, i'd review it, push back two or three times, merge it. 90 minutes for a feature that would take me a day.

then one afternoon in week two i wrote a single spec for the interview capture pipeline. audio ingest, transcription, diarization, structured extraction, storage, redis queue, the whole thing. four files, 900 lines of implementation, two tests passing. about 40 minutes.
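to give a sense of the shape (the stage names and types here are hypothetical, not the actual mazle code), the spec basically described a chain of typed async stages, which is what made it so easy for the agent to fill in:

```typescript
// hypothetical sketch of the capture pipeline shape -- stage names
// and types are illustrative, not the real implementation.

interface AudioChunk { id: string; bytes: Uint8Array }
interface Transcript { chunkId: string; text: string }
interface Utterance { speaker: string; text: string }
interface Signal { field: string; value: string; confidence: number }

type Stage<In, Out> = (input: In) => Promise<Out>;

// each stage is a pure async function, so the whole pipeline is
// just composition -- easy to spec, easy for an agent to scaffold.
function pipe<A, B, C>(f: Stage<A, B>, g: Stage<B, C>): Stage<A, C> {
  return async (a) => g(await f(a));
}

// stub stages standing in for transcription / diarization / extraction
const transcribe: Stage<AudioChunk, Transcript> = async (c) => ({
  chunkId: c.id,
  text: "placeholder transcript",
});

const diarize: Stage<Transcript, Utterance[]> = async (t) => [
  { speaker: "interviewer", text: t.text },
];

const extract: Stage<Utterance[], Signal[]> = async (us) =>
  us.map((u) => ({ field: "note", value: u.text, confidence: 0.5 }));

const capture = pipe(pipe(transcribe, diarize), extract);

capture({ id: "a1", bytes: new Uint8Array() }).then((signals) =>
  console.log(signals.length), // logs 1
);
```

the point isn't the stubs. it's that once the spec pins down the types at each boundary, the agent's job reduces to filling in one stage at a time.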

that sprint, by hand, would have been a week. it was the first time i actually paused and thought: *oh. this is different.*

the multiplier wasn't the writing speed. it was the fact that i could hold the entire system in my head while the agent filled in the scaffolding underneath.

## week 3: where it broke

week three is the week every "ai-built a startup" essay leaves out.

auth broke. not in the code, exactly. in the model. i had two overlapping mental models for what a "user" was (one from the billing side, one from the team-invite side) and claude dutifully implemented both of them, halfway each. the bug surfaced three days later when someone invited a teammate and the teammate got another org's data in their dashboard.
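here's a toy reconstruction of the failure mode, with made-up names rather than the real code. the billing model resolves "the user's org" from signup; the invite model makes org membership many-to-many; a query helper written against the first model leaks data the moment the second one exists:

```typescript
// hypothetical reconstruction of the two-"user"-models bug.
// names and data are illustrative, not the actual codebase.

interface Row { orgId: string; data: string }

const rows: Row[] = [
  { orgId: "org-a", data: "org-a dashboard" },
  { orgId: "org-b", data: "org-b dashboard" },
];

// billing model: a user belongs to exactly one org, fixed at signup
const billingOrg: Record<string, string> = { alice: "org-a", bob: "org-b" };

// invite model: a user can be a member of any number of orgs
const memberships: Record<string, string[]> = {
  alice: ["org-a"],
  bob: ["org-b", "org-a"], // bob was invited into org-a
};

// buggy: resolves the org through the billing mapping, so bob,
// viewing the org-a workspace, gets org-b's rows.
function dashboardBuggy(userId: string): Row[] {
  const org = billingOrg[userId];
  return rows.filter((r) => r.orgId === org);
}

// fixed: the active org is explicit and checked against membership
function dashboard(userId: string, activeOrg: string): Row[] {
  if (!memberships[userId]?.includes(activeOrg)) return [];
  return rows.filter((r) => r.orgId === activeOrg);
}
```

both halves of the buggy version are individually reasonable, which is exactly why an agent implementing two half-specified models produces code that reads fine and fails three days later.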

that was a bad day.

state management got weird. we had optimistic ui in a few places where it shouldn't have been, and no optimistic ui in places where it should have been, because i'd never written a consistent spec for how the frontend should handle latency. so every feature had its own vibe.

the subtle bug that took me a full day to find was in the contradiction-detection logic. it was "working" the whole time. the outputs looked reasonable. but the scoring was drifting because of a cumulative rounding error in how we merged confidence scores across passes. no test caught it. claude certainly didn't catch it. i found it by staring at 300 rows of debug output and noticing a number that didn't match what it should have been.
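here's the class of bug in miniature. this is a hypothetical reconstruction, not the real scoring code: an exponential-moving-average merge that rounds its running value on every pass quietly settles on a different number than the same merge over the raw values:

```typescript
// hypothetical reconstruction of the failure class, not the real
// scoring code: a merge that rounds its running value every pass.

const round2 = (x: number) => Math.round(x * 100) / 100;

// exponential-moving-average merge of confidence scores across passes
function mergeRaw(scores: number[]): number {
  let m = scores[0];
  for (const s of scores.slice(1)) m = 0.9 * m + 0.1 * s;
  return m;
}

function mergeRoundedEachPass(scores: number[]): number {
  let m = round2(scores[0]); // rounding here, every pass, is the bug
  for (const s of scores.slice(1)) m = round2(0.9 * m + 0.1 * s);
  return m;
}

const passes = new Array(300).fill(0.567);

console.log(mergeRaw(passes).toFixed(3));            // stays at 0.567
console.log(mergeRoundedEachPass(passes).toFixed(3)); // pinned at 0.570
```

no single pass looks wrong, every intermediate value is plausible, and no unit test on one merge step catches it. you only see it by comparing the end-to-end number against what the raw inputs say it should be, which is what staring at 300 rows of debug output amounts to.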

this is the part that the "agents will replace engineers" people don't seem to have lived through. the agent is great at writing code. it is not great at holding a skeptical, adversarial stance toward its own output. that's still your job.

## week 4: what i had to rewrite by hand

by week four i had a pretty good sense of what claude was good at and what i had to do myself.

**what it did well.** boilerplate. api handlers. form components. migrations. anything where the shape was obvious and there was a canonical pattern. data transformations. tests, once i wrote the first one by hand and it had a model to imitate.

**what i rewrote by hand.** the pricing logic, because the implementation was correct but didn't match how i actually wanted to charge. the transcript reconciliation code, because the agent kept choosing a cleaner abstraction that cost us 300ms per call. the prompts themselves, because claude is unsurprisingly terrible at writing prompts for claude.

i'd say about 85% of the code in the repo was written by the agent, 10% was heavily edited agent output, and 5% was written by hand. but the 5% is the 5% that makes the product work.

## the real lesson

the meme is "ai will replace engineers." that's the wrong frame.

the actual shift is closer to this: a founder with clear taste and the discipline to write a good spec can now ship what used to require a team of four. the engineer doesn't disappear. the bar moves. the person who used to be a "senior engineer" is now doing the work of a "staff engineer plus tech lead plus product manager," and the agent does everything below that line.

this is great news if you're a senior engineer. you just got a very powerful set of hands. it's terrifying news if you're a junior engineer whose job was "be the extra pair of hands." that tier is disappearing. fast.

for founders, the practical implication is: you can now build much more of your own product than you used to be able to. whether you should is a different question. the answer depends on how much you trust your own taste. mine was, apparently, good enough for a month. ask me again in a year.

## the number

30,143 lines of typescript. 6,221 lines of sql migrations. 1,407 lines of tests. 4 configs. 1 founder. 1 claude.

would i do it again? yes, but differently. i'd write the auth spec twice as long. i'd commit every 20 minutes, not every 30. i'd write the first integration test before the first feature, not after. and i'd trust my skeptical instincts faster when something smelled wrong.

the code works. the product ships. that's the receipt.

but the actual work, the part that mattered, was all the stuff the agent couldn't do.
]]></content:encoded>
    </item>
  </channel>
</rss>