How I Work
Operating principles, distilled from eight years of designing operating systems
Most case studies show what I built. This page is the how — the patterns I now reach for, the moves I default to, and the calls I trust myself to make when the situation is ambiguous.
Six Principles
These aren't aspirations. They're patterns that have shown up across enough projects that I now treat them as defaults. Each is illustrated with where it surfaces in my work.
1. Trust is built in transparent breakdowns.
Whenever a system makes a decision that affects someone's livelihood, dignity, or experience, the system must show its work. Not just the result: the math, the criteria, the reasoning, on a single screen the person can audit themselves.
This is non-negotiable for me now. If I can't make the breakdown legible, I haven't designed the system properly yet.
Where this shows up:
- QC Audit → CMs see exactly why each audit scored what it did (+50 for valid food, +30 for meaningful caption, etc.)
- Caregiver OS → Payslips break down every adjustment, every bonus, every penalty, every overtime payment, with the reason
- Tipping flow → Total breakdown visible before payment; admin fee surfaced explicitly, not buried
- Notification Board → Every notification shows who triggered it, when, and what action it expects
The pattern: opacity erodes trust faster than any feature can build it. If your AI scores poorly, surface why. If a paycheck is short, show the reason on the screen, not in a chat with HR.
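The pattern can be made concrete. A minimal sketch, assuming an illustrative rubric (these criterion names and weights are invented, not the actual QC Audit scoring): a score is never a bare number but a list of labeled line items that sum to the total, so the breakdown on screen can never disagree with the result.

```python
from dataclasses import dataclass

@dataclass
class LineItem:
    reason: str   # human-readable criterion, shown on screen
    points: int   # signed contribution to the total

def score(items: list[LineItem]) -> int:
    # The total is derived from the visible line items, so the
    # breakdown and the result cannot drift apart.
    return sum(item.points for item in items)

audit = [
    LineItem("Valid food photo", +50),    # illustrative weights,
    LineItem("Meaningful caption", +30),  # not the real rubric
    LineItem("Submitted late", -10),
]

assert score(audit) == 70
for item in audit:
    print(f"{item.points:+d}  {item.reason}")
```

The design choice is the data shape: because the total is computed from the displayed items, "show the math" is structurally guaranteed rather than maintained by hand.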
2. Augment, don't replace.
The temptation with AI is to remove humans entirely. The better move is almost always to design AI as a colleague that elevates human judgment.
The human still decides. The AI does the work that wasn't worth a human's time anyway. Same headcount, dramatically different output.
Where this shows up:
- AI Meal Log → AI estimates calories instantly; coach focuses on behavioral coaching, not arithmetic
- QC Audit → AI scores audits; CM reviews exceptions and corrects edge cases that train the next month
- Monthly Wrapped → AI generates content; CM approves and edits; edits become training signal
- Auto-matching → System recommends caregiver matches; Ops confirms with override capability
I now treat fully-autonomous AI as a graduation, not a starting point. Phase 1 ships with full human review. Phase 2 with sampling. Phase 3 with exceptions only. Trust compounds. So does training data.
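That graduation can be sketched as a review policy. The phase structure follows the description above; the 20% sampling rate and the function itself are illustrative assumptions, not a real implementation:

```python
import random

def needs_human_review(phase: int, is_exception: bool,
                       sample_rate: float = 0.2) -> bool:
    """Decide whether an AI output goes to a human reviewer.

    Phase 1: every output is reviewed.
    Phase 2: exceptions, plus a random sample of the rest.
    Phase 3: exceptions only.
    (The 20% sample rate is an invented placeholder.)
    """
    if phase == 1:
        return True
    if phase == 2:
        return is_exception or random.random() < sample_rate
    return is_exception

assert needs_human_review(1, is_exception=False)
assert needs_human_review(2, is_exception=True)
assert not needs_human_review(3, is_exception=False)
```

The point of the shape: the AI always produces the output; the only thing that changes per phase is how much of it a human sees, which is why trust and training data can compound in parallel.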
3. The screenshot is the product.
Some of the most leveraged design opportunities live in things people are already doing manually. Watch where users are screenshotting, copy-pasting, forwarding to WhatsApp, reformatting in Canva. That manual work is a design brief written by behavior.
Where this shows up:
- Monthly Wrapped → Parents were already screenshotting MPC decks. The screenshot was the product. We just gave them the real version.
- Share as Frame → Parents were already saving timeline photos to share. We added the brand frame.
- Referral System → Parents were already manually telling friends. We gave the recommendation a tracked link.
I look for this pattern everywhere now. What are users translating, on their own, with friction, to make our product fit their life? That translation work is almost always a design brief in disguise.
4. User-driven migrations beat bulk cutovers.
When a system has to move from one substrate to another, the default move is the wrong one. Bulk migrations look efficient on paper. They break in production.
Far better: two paths, one destination. Legacy users migrate at their own pace on next interaction. New users skip the legacy system entirely. Both paths converge. The legacy system runs until the last user crosses over — and only then is decommissioned.
Where this shows up:
- Account Unification → "Perbarui Account" ("Update Account") flow on next login; new users skip Supabase entirely. No migration weekend. No cutover moment.
I now apply this to any project where production data has to move. Slower than a bulk migration. Far more reliable.
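The two-paths-one-destination pattern reduces to a branch in the login path rather than a batch job. A sketch under stated assumptions (the stores are stand-in dicts; the real systems involved are not modeled here):

```python
def on_login(user_id: str, legacy_store: dict, unified_store: dict) -> dict:
    """Migrate a user to the unified store on their next interaction.

    Legacy users cross over the first time they log in; new users
    never touch the legacy store at all. The legacy store empties
    one user at a time and can be decommissioned at zero.
    """
    if user_id in unified_store:               # already migrated, or new
        return unified_store[user_id]
    if user_id in legacy_store:                # legacy user: migrate now
        unified_store[user_id] = legacy_store.pop(user_id)
        return unified_store[user_id]
    unified_store[user_id] = {"id": user_id}   # brand-new user
    return unified_store[user_id]

legacy = {"old_user": {"id": "old_user", "plan": "basic"}}
unified: dict = {}
on_login("old_user", legacy, unified)
assert "old_user" in unified and not legacy
```

There is no cutover moment anywhere in this flow, which is exactly the property that makes it slower on the calendar and safer in production.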
5. Budget for the dip.
Any system designed to get worse before it gets better will look like failure on the dashboard at day 30. The discipline isn't to avoid the dip — it's to predict it explicitly, name what would constitute real failure (vs the expected discomfort), and hold the line until the leading indicators tell you to either keep going or stop.
Where this shows up:
- Caregiver Hiring → Lead inflow cratered to −93% at 30 days before recovering to +435% at 6 months. The predicted curve held. Stakeholder anxiety didn't.
- Notification Board → Missed billing held flat for the first 7 days. Improvement compounded every cycle after. Linear thinking would have called the product a failure on day 7.
The lesson generalizes: for any system designed to get worse before it gets better, the predicted curve is itself a deliverable. Ship the curve before you ship the system. Stakeholder anxiety drops dramatically when the dip was on the plan.
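One way to make the curve a deliverable, sketched with invented numbers (the predicted values and tolerance below are placeholders, not the actual Caregiver Hiring plan): publish the trajectory with an explicit failure threshold before launch, then judge actuals against the band rather than against zero.

```python
# Predicted change vs. baseline at each checkpoint day, published
# before launch. All numbers here are invented for illustration.
predicted = {7: -0.80, 30: -0.93, 90: 0.50, 180: 4.35}

def verdict(day: int, actual: float, tolerance: float = 0.15) -> str:
    """'on plan' if the actual is within tolerance of the predicted
    value for that day; 'real failure' only if it undershoots the
    plan by more than the tolerance."""
    planned = predicted[day]
    if actual >= planned - tolerance:
        return "on plan"
    return "real failure"

assert verdict(30, -0.93) == "on plan"        # the dip was on the plan
assert verdict(30, -1.20) == "real failure"   # worse than predicted
```

The function encodes the discipline in the text: "real failure" is defined in advance, so a day-30 dashboard reading of −93% triggers a comparison, not a panic.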
6. Free-text is invisible to the operating system.
When you let people type whatever they want into a "reason" or "note" field, you've made the data invisible. It can't be queried, compared, trended, or improved. It looks like flexibility. It's actually opacity.
Structured taxonomies look like more work up front. They pay back forever in operational intelligence.
Where this shows up:
- Account Unification → Resignation reasons are a multi-select taxonomy, not free text. HR can ask "how many candidates resigned during psikotest" and get a real answer.
- Notification Board → Two authoring paths (automated vs. config) with structured event types and deterministic click-through routes.
- QC Audit → Validation criteria are explicit and weighted, not subjective.
- Caregiver OS → Every payroll adjustment carries a reason from a defined taxonomy.
Structured data is the difference between a system that runs a business and a system that records one. Most companies confuse the two.
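A minimal sketch of the difference (the reason names are illustrative, not the actual HR taxonomy): with an enumerated taxonomy, "how many candidates resigned during psikotest" is a one-line query; with free text, it is a manual reading exercise.

```python
from enum import Enum
from collections import Counter

class ResignReason(Enum):
    DURING_PSIKOTEST = "during_psikotest"   # illustrative taxonomy,
    FOUND_OTHER_JOB = "found_other_job"     # not the real reason set
    FAMILY = "family"

records = [
    ResignReason.DURING_PSIKOTEST,
    ResignReason.FOUND_OTHER_JOB,
    ResignReason.DURING_PSIKOTEST,
]

# Structured data turns the operational question into a one-liner.
counts = Counter(records)
assert counts[ResignReason.DURING_PSIKOTEST] == 2
```

The enum is the up-front cost; the `Counter` line is the payback. Free-text reasons would force this same question through fuzzy string matching, and the answer would never be trustworthy enough to act on.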
Three Things I'm Skeptical Of
The flip side of strong principles is strong skepticism. These are the moves I now reach for less, despite their popularity.
"Just add a notification." Adding a channel almost never solves a fragmentation problem. The board model — persistent, queryable, authoritative — is almost always the right move. (See: Notification Board)
"Let's run more research before we decide." Sometimes. But often, the research is being run because nobody wants to make the decision. After a certain point, additional interviews don't sharpen the answer; they just delay it. Make the decision. Watch what happens. Iterate.
"We'll fix it with a database expansion." This one's specifically from my Sirka era. Stakeholders kept assuming the meal log problem was "the database isn't big enough." It wasn't. Users wanted instant feedback, not more options. Whenever the proposed fix is "more of the existing thing," I now pause and ask whether the underlying model is the actual problem.
What I Look for in Problems
When I'm choosing where to spend my time, I look for problems with these properties:
The constraint is structural, not marginal. I'm not interested in 5% improvements. I'm interested in problems where the current model can't reach the target no matter how well it's executed. (See: Sirka's 1:60 → 1:105 ratio. Couldn't be optimized into. Had to be redesigned around.)
The breakage is invisible. The most leveraged work is in places where the failure mode looks like background noise — small frictions that compound, fragmented information that nobody owns, manual work that nobody questions because "that's just how it is." (See: Notification Board. Missed billing was a noise pattern, not a clear bug.)
The fix is upstream, not in-screen. I'm wary of design problems that turn out to be PM problems, ops problems, or strategy problems disguised as design briefs. I lean into briefs that explicitly require changing the substrate, not the surface.
There's a real consequence to getting it right. Compensation, identity, payslips, audit, capacity. These are systems where the math has to be right because someone's livelihood depends on the math being right. That's where I do my best work.
What I'm Still Working On
Principal-level honesty includes the things I haven't figured out yet.
Sequencing AI feasibility against design. I still default to designing first and validating AI capabilities second. That partially worked for QC Audit; it would have worked better the other way around. I'm trying to internalize "ship the SPIKE first" as the default for any AI-integrated system.
Communicating dips without losing the room. I held the line on Caregiver Hiring during the first-30-days dip, but the stakeholder conversations were harder than they needed to be. I'm getting better at writing predicted curves before launch, but it's still my weakest discipline.
Knowing when to ship V1 vs. polishing to V1.5. I sometimes over-iterate on quality at the expense of shipping speed. The Caregiver OS is the clearest example: I could have shipped Payslip Transparency six weeks earlier than I did, and most of those weeks went to details that haven't materially changed retention.
Handing off systems I built. I've shipped a lot at MainStory. The next phase is making sure the systems are legible and ownable by people who didn't build them. That's a different skill from designing them in the first place.
How This Translates to How I Work With Teams
I work best when:
- The brief is set at the strategic layer. "Reduce salary disputes" is a brief I can do something with. "Redesign the payslip" is too narrow.
- Engineering is involved early. I write specs, not just Figma files. The spec is the artifact engineering builds against. We collaborate on the model before the screen.
- Ops gets to push back. Operators always know things I don't. The blueprint is sharper for their input. The design is more honest for their objection.
- There's a metric. I want to know what the system is supposed to do, in numbers, before I draw a single screen. Then we can argue about whether the design will get there.
- Iteration is allowed post-launch. Most of my best calls are visible only in retrospect. I want the freedom to ship V1, watch, learn, and ship V2 without re-justifying the entire premise.
A Note on Scope
I've spent the last few years working at one company across multiple deeply different surface areas — operations, workforce, growth, identity, service blueprints. That's unusual.
What it has taught me is that design principles are more general than design contexts. The patterns that worked for replacing a Retool dashboard with AI also worked for redesigning compensation, also worked for migrating identity systems. The substrate changed. The thinking didn't.
So when I think about my next role, I think less about industry vertical and more about whether the work is at the operating-system layer. If it is — if the company needs someone who can architect systems that affect how people get paid, how products get delivered, how teams stay coordinated — that's the work I'm built for.
Amalia Nurul Balqis · Principal-track Product Designer · Jakarta · 2026