Streaming HTML with React Suspense: Reduce TTFB by 40%
React Suspense with streaming SSR lets you send HTML progressively instead of waiting for all data. Here's how to implement it in Next.js for dramatically faster page loads.
Traditional server-side rendering has a fundamental bottleneck: the server must fetch all data before sending any HTML. If your page has a fast header query (5ms) and a slow recommendations query (800ms), the user waits 800ms for everything — including the header they could have seen 795ms earlier.
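The blocking pattern is easy to see with simulated latencies. This is a hypothetical sketch, not Next.js internals: `getHeader` and `getRecommendations` are stand-ins with the timings from the example above.

```typescript
// Sketch of the traditional SSR bottleneck: no HTML leaves the server
// until the slowest query resolves. Helpers are hypothetical stand-ins
// with the latencies from the example simulated via timers.
const wait = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function getHeader(): Promise<string> {
  await wait(5); // fast query
  return "<header>Site header</header>";
}

async function getRecommendations(): Promise<string> {
  await wait(800); // slow query
  return "<section>Recommendations</section>";
}

async function renderPageBlocking(): Promise<string> {
  // Even fetched in parallel, the full response waits ~800ms on the slow query
  const [header, recs] = await Promise.all([getHeader(), getRecommendations()]);
  return `<html><body>${header}${recs}</body></html>`;
}
```

The header is ready after 5ms, but the response cannot begin until the recommendations resolve.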
Streaming HTML with React Suspense eliminates this bottleneck. The server sends HTML progressively as data becomes available. The header renders immediately. The recommendations stream in when ready. Time to First Byte drops dramatically.
How Streaming Works in Next.js
In the App Router, streaming is built into Server Components. When you wrap a component in <Suspense>, Next.js streams the fallback immediately and replaces it with the real content when the async operation completes.
// app/blog/[slug]/page.tsx
import { Suspense } from "react";
import { ArticleContent } from "@/components/blog/article-content";
import { CommentSection } from "@/components/blog/comment-section";
import { RelatedPosts } from "@/components/blog/related-posts";
import { ArticleSkeleton, CommentSkeleton, RelatedSkeleton } from "@/components/skeletons";

export default async function BlogPostPage({
  params,
}: {
  params: Promise<{ slug: string }>;
}) {
  const { slug } = await params;
  return (
    <article>
      {/* This streams first — fast DB query */}
      <Suspense fallback={<ArticleSkeleton />}>
        <ArticleContent slug={slug} />
      </Suspense>

      {/* Comments stream independently — may be slow */}
      <Suspense fallback={<CommentSkeleton />}>
        <CommentSection slug={slug} />
      </Suspense>

      {/* Related posts stream independently */}
      <Suspense fallback={<RelatedSkeleton />}>
        <RelatedPosts slug={slug} />
      </Suspense>
    </article>
  );
}
Each <Suspense> boundary is an independent streaming unit. The server sends the shell and fast sections first, then progressively fills in slower sections. The user sees content appearing in stages instead of a blank screen followed by everything at once.
The loading.tsx Convention
Next.js provides a file-based way to add Suspense boundaries at the route level. Creating a loading.tsx file wraps the entire page in Suspense automatically:
// app/blog/[slug]/loading.tsx
export default function BlogPostLoading() {
  return (
    <div className="max-w-3xl mx-auto px-4 py-8">
      {/* Skeleton matches the layout of the actual page */}
      <div className="h-8 w-3/4 bg-muted animate-pulse rounded mb-4" />
      <div className="flex items-center gap-3 mb-8">
        <div className="h-10 w-10 bg-muted animate-pulse rounded-full" />
        <div className="h-4 w-32 bg-muted animate-pulse rounded" />
      </div>
      <div className="space-y-3">
        {/* Varied line widths (values are illustrative) read more like real text */}
        {Array.from({ length: 8 }).map((_, i) => (
          <div
            key={i}
            className="h-4 bg-muted animate-pulse rounded"
            style={{ width: `${100 - (i % 4) * 8}%` }}
          />
        ))}
      </div>
    </div>
  );
}
But for optimal streaming, use granular <Suspense> boundaries inside the page instead of a single loading.tsx. This way, fast sections render immediately rather than waiting behind a page-level skeleton.
Measuring the Impact
The metrics that improve with streaming:
- Time to First Byte (TTFB): The server sends the initial HTML shell immediately instead of waiting for all data. We measured a 40% TTFB reduction on pages with mixed-speed data sources.
- First Contentful Paint (FCP): The browser can start rendering the shell and fast components while slower ones are still loading.
- Largest Contentful Paint (LCP): If your LCP element (hero image, main heading) is in a fast data path, it renders much earlier.
When Not to Stream
Streaming adds complexity. Skip it when:
- All data sources are fast (<50ms) — the overhead isn't worth it
- The page is fully static — use ISR or static generation instead
- SEO requires complete HTML on initial response — most crawlers handle streaming now, but verify for your target search engines
Advanced: Parallel Data Fetching
Combine streaming with parallel data fetching for maximum performance. Instead of sequential awaits, kick off all fetches simultaneously and let Suspense handle the resolution order:
// Each component fetches its own data independently;
// wrapped in separate <Suspense> boundaries, they resolve in parallel.
async function ArticleContent({ slug }: { slug: string }) {
  const article = await getArticleBySlug(slug); // 20ms
  return <div>{article.content}</div>;
}

async function CommentSection({ slug }: { slug: string }) {
  const comments = await getCommentsBySlug(slug); // 200ms
  return <div>{comments.map(c => <Comment key={c.id} {...c} />)}</div>;
}

async function RelatedPosts({ slug }: { slug: string }) {
  const related = await getRelatedPosts(slug); // 150ms
  return <div>{related.map(p => <PostCard key={p.id} {...p} />)}</div>;
}

// Total time: ~200ms (parallel) instead of ~370ms (sequential)
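The timing claim can be checked with a small simulation. The fetchers are stand-ins using the delay values from the comments above; only the resolution pattern matters.

```typescript
// Simulated fetchers with the latencies noted above (illustrative values).
const simulate = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function sequentialTotal(): Promise<number> {
  const start = Date.now();
  await simulate(20);  // article
  await simulate(200); // comments
  await simulate(150); // related
  return Date.now() - start; // ~370ms: latencies add up
}

async function parallelTotal(): Promise<number> {
  const start = Date.now();
  // Independent Suspense-wrapped components resolve like Promise.all:
  // total time is the slowest fetch, not the sum.
  await Promise.all([simulate(20), simulate(200), simulate(150)]);
  return Date.now() - start; // ~200ms: bounded by the slowest fetch
}
```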
Takeaways
- Wrap slow data-fetching components in <Suspense> for progressive streaming
- Use granular Suspense boundaries instead of page-level loading.tsx for optimal UX
- Design skeletons that match the actual layout to prevent layout shift
- Combine streaming with parallel data fetching for maximum throughput
- Measure TTFB and FCP to quantify the improvement — expect 30–50% gains on data-heavy pages