性能对任何应用程序都至关重要。
开发者经常在慢速数据库查询上添加缓存层,这只是在掩盖症状而非解决问题。本文将挑战如何优化一个真实场景中的EF Core慢查询。EF Core提供了强大的工具,但使用不当会导致查询缓慢。
我将一步步展示如何将EF Core查询从难以接受的30秒优化到惊人的30毫秒。
我们以一个社交媒体平台为例,包含以下实体:
public class User
{
public int Id { get; set; }
public string Username { get; set; } = null!;
public ICollection<Post> Posts { get; set; } = new List<Post>();
public ICollection<Comment> Comments { get; set; } = new List<Comment>();
}
public class Post
{
public int Id { get; set; }
public string Content { get; set; } = null!;
public int UserId { get; set; }
public User User { get; set; } = null!;
public int CategoryId { get; set; }
public Category Category { get; set; } = null!;
public ICollection<Like> Likes { get; set; } = new List<Like>();
public ICollection<Comment> Comments { get; set; } = new List<Comment>();
}
public class Category
{
public int Id { get; set; }
public string Name { get; set; } = null!;
public ICollection<Post> Posts { get; set; } = new List<Post>();
}
public class Comment
{
public int Id { get; set; }
public int UserId { get; set; }
public User User { get; set; } = null!;
public int PostId { get; set; }
public Post Post { get; set; } = null!;
public string Text { get; set; } = null!;
public DateTime CreatedAt { get; set; }
}
public class Like
{
public int Id { get; set; }
public int PostId { get; set; }
public Post Post { get; set; } = null!;
}
实体关系如下:
挑战需求: 查询过去7天在".NET"分类下评论最多的5位用户,每位用户返回:
初始慢查询实现:
public List<ActiveUserDto> GetTopCommenters_Slow()
{
var since = DateTime.UtcNow.AddDays(-7);
// 1) 急切加载所有用户及其评论→帖子→分类→点赞
var users = _dbContext.Users
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Category)
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Likes)
.ToList();
var result = new List<ActiveUserDto>();
foreach (var u in users)
{
// 2) 筛选近期".NET"评论
var comments = u.Comments
.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
.ToList();
var commentsCount = comments.Count;
if (commentsCount == 0)
{
continue;
}
// 3) 按点赞数获取前3篇帖子
var topPosts = comments
.GroupBy(c => c.Post)
.Select(g => new PostDto(
g.Key.Id,
_dbContext.Likes.Count(l => l.PostId == g.Key.Id)))
.OrderByDescending(p => p.LikesCount)
.Take(3)
.ToList();
// 4) 在这些热门帖子上获取最新2条评论
var topPostIds = topPosts.Select(p => p.PostId).ToList();
var recentTexts = _dbContext.Comments
.Where(c =>
c.UserId == u.Id &&
c.CreatedAt >= since &&
topPostIds.Contains(c.PostId))
.OrderByDescending(c => c.CreatedAt)
.Take(2)
.Select(c => c.Text)
.ToList();
result.Add(new ActiveUserDto(
u.Id, u.Username,
commentsCount,
topPosts,
recentTexts));
}
// 5) 内存中筛选最终前5名
return result
.OrderByDescending(x => x.CommentsCount)
.Take(5)
.ToList();
}
此实现将所有数据急切加载到内存中,然后在客户端进行大量筛选、排序和聚合,导致从数据库传输大量数据,消耗大量内存,性能极慢。
在PostgreSQL中此查询耗时29-30秒。
我们加载了所有用户及其评论,包括那些没有在".NET"分类下评论过的用户。添加筛选条件只返回实际在".NET"分类下评论过的用户:
public List<ActiveUserDto> GetTopCommenters_Optimization1_PreFilter()
{
var since = DateTime.UtcNow.AddDays(-7);
// 只筛选有近期.NET评论的用户
var users = _dbContext.Users
.Where(u => u.Comments
.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Category)
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Likes)
.ToList();
// 其余代码与慢查询相同
}
优化后节省了1秒,数据量越大效果越明显。
在筛选用户后,进一步限制只返回评论最多的5位用户:
public List<ActiveUserDto> GetTopCommenters_Optimization2_LimitUsers()
{
var since = DateTime.UtcNow.AddDays(-7);
// 在数据库中计算评论数,然后取前5
var users = _dbContext.Users
.Where(u => u.Comments
.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.OrderByDescending(u => u.Comments
.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
)
.Take(5)
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Category)
.Include(u => u.Comments)
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Likes)
.ToList();
// 其余代码与慢查询相同
}
优化后从27秒降至17秒,性能显著提升。
只连接必要的数据表:
public List<ActiveUserDto> GetTopCommenters_Optimization3_FilterComments()
{
var since = DateTime.UtcNow.AddDays(-7);
// 只包含符合筛选条件的评论
var users = _dbContext.Users
.Include(u => u.Comments.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.ThenInclude(c => c.Post)
.ThenInclude(p => p.Likes)
.Where(u => u.Comments.Any())
.OrderByDescending(u => u.Comments
.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
)
.Take(5)
.ToList();
// 其余处理逻辑
}
优化后降至10秒,内存消耗减少近半。
分两阶段获取数据:
public List<ActiveUserDto> GetTopCommenters_Optimization4_Projection()
{
var since = DateTime.UtcNow.AddDays(-7);
// 获取用户及其评论数
var topUsers = _dbContext.Users
.AsNoTracking() // 只读操作使用AsNoTracking
.Where(u => u.Comments.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.Select(u => new
{
u.Id,
u.Username,
CommentsCount = u.Comments.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
})
.OrderByDescending(u => u.CommentsCount)
.Take(5)
.ToList();
// 其余处理逻辑
}
尽管有N+1查询,但时间从10秒降至40毫秒,共发送11次简单数据库调用。
尝试将所有筛选、计数、分组、排序和分页放入单个LINQ语句:
public List<ActiveUserDto> GetTopCommenters_Optimization5_OneQuery()
{
var since = DateTime.UtcNow.AddDays(-7);
var projected = _dbContext.Users
// 1) 筛选有.NET评论的用户
.Where(u => u.Comments
.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
// 2) 一次性投影所有数据
.Select(u => new ActiveUserDto(
u.Id,
u.Username,
u.Comments.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"),
u.Comments
.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
.GroupBy(c => c.Post)
.Select(g => new PostDto(
g.Key.Id,
g.Key.Likes.Count))
.OrderByDescending(p => p.LikesCount)
.Take(3)
.ToList(),
u.Comments
.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET" &&
u.Comments
.Where(d =>
d.CreatedAt >= since &&
d.Post.Category.Name == ".NET")
.GroupBy(d => d.Post)
.OrderByDescending(g => g.Key.Likes.Count)
.Take(3)
.Select(g => g.Key.Id)
.Contains(c.PostId))
.OrderByDescending(c => c.CreatedAt)
.Take(2)
.Select(c => c.Text)
.ToList()
))
.OrderByDescending(x => x.CommentsCount)
.Take(5)
.ToList();
return projected;
}
相比11次小查询,单查询仅快3毫秒,但SQL语句复杂且存在笛卡尔积爆炸风险。
当投影深层对象图时,EF Core常生成包含多个JOIN的大型SQL,导致数据重复和性能下降。使用.AsSplitQuery()将大查询拆分为多个简单SQL语句:
public List<ActiveUserDto> GetTopCommenters_Optimization6_SplitQuery()
{
var since = DateTime.UtcNow.AddDays(-7);
var projected = _dbContext.Users
// 与优化5相同代码
.AsSplitQuery()
.ToList();
return projected;
}
结果出乎意料:耗时51毫秒,比11次查询的方案更慢。
分三个阶段获取数据:
public List<ActiveUserDto> GetTopCommenters_Optimization7_ThreePhaseOptimized()
{
var since = DateTime.UtcNow.AddDays(-7);
// 阶段1:获取前5名用户
var topUsers = _dbContext.Users
.Where(u => u.Comments.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.Select(u => new
{
u.Id,
u.Username,
CommentsCount = u.Comments.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
})
.OrderByDescending(x => x.CommentsCount)
.Take(5)
.ToList();
// 阶段2:获取热门帖子
var topPostsPerUser = _dbContext.Comments
.Where(c =>
userIds.Contains(c.UserId) &&
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
.GroupBy(c => new { c.UserId, c.PostId })
.Select(g => new { g.Key.UserId, g.Key.PostId, LikesCount = g.First().Post.Likes.Count })
.ToList()
.GroupBy(x => x.UserId)
.ToDictionary(
g => g.Key,
g => g.OrderByDescending(x => x.LikesCount)
.Take(3)
.Select(x => new PostDto(x.PostId, x.LikesCount))
.ToList()
);
// 阶段3:获取最新评论
var recentCommentsPerUser = _dbContext.Comments
.Where(c =>
userIds.Contains(c.UserId) &&
c.CreatedAt >= since &&
allTopPostIds.Contains(c.PostId))
.OrderByDescending(c => c.CreatedAt)
.Select(c => new { c.UserId, c.Text, c.CreatedAt })
.ToList()
.GroupBy(c => c.UserId)
.ToDictionary(
g => g.Key,
g => g.OrderByDescending(c => c.CreatedAt)
.Take(2)
.Select(c => c.Text)
.ToList()
);
// 合并最终结果
return topUsers
.Select(u => new ActiveUserDto(
u.Id,
u.Username,
u.CommentsCount,
topPostsPerUser.TryGetValue(u.Id, out var posts) ? posts : [],
recentCommentsPerUser.TryGetValue(u.Id, out var comments) ? comments : []
))
.ToList();
}
优化后降至31毫秒。
分两个阶段:
public List<ActiveUserDto> GetTopCommenters_Optimization8_TwoPhaseOptimized()
{
var since = DateTime.UtcNow.AddDays(-7);
// 阶段1:获取用户摘要
var summaries = _dbContext.Users
.Where(u => u.Comments
.Any(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"))
.Select(u => new
{
u.Id,
u.Username,
CommentsCount = u.Comments
.Count(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET"),
TopPosts = u.Comments
.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET")
.GroupBy(c => c.PostId)
.Select(g => new
{
PostId = g.Key,
LikesCount = _dbContext.Likes
.Count(l => l.PostId == g.Key)
})
.OrderByDescending(p => p.LikesCount)
.Take(3)
.ToList()
})
.OrderByDescending(x => x.CommentsCount)
.Take(5)
.ToList();
// 阶段2:获取最新评论
var recentCommentsLookup = _dbContext.Comments
.Where(c =>
c.CreatedAt >= since &&
c.Post.Category.Name == ".NET" &&
userIds.Contains(c.UserId) &&
postIds.Contains(c.PostId))
.GroupBy(c => new { c.UserId, c.PostId })
.Select(g => new
{
g.Key.UserId,
g.Key.PostId,
LatestTwo = g
.OrderByDescending(c => c.CreatedAt)
.Take(2)
.Select(c => c.Text)
.ToList()
})
.ToList()
.ToLookup(x => x.UserId, x => x);
// 组合最终DTO
return summaries
.Select(s => new ActiveUserDto(
s.Id,
s.Username,
s.CommentsCount,
s.TopPosts
.Select(tp => new PostDto(tp.PostId, tp.LikesCount))
.ToList(),
recentCommentsLookup[s.Id]
.SelectMany(x => x.LatestTwo)
.Take(2)
.ToList()
))
.ToList();
}
优化后稳定在29-30毫秒,是目前最快的方案。
| 优化方案 | 执行时间 | 内存消耗 | |---------|---------|---------| | 原始方案 | 30秒 | - | | 优化1 | 29秒 | - | | 优化2 | 17秒 | - | | 优化3 | 10秒 | - | | 优化4 | 40毫秒 | - | | 优化5 | 38毫秒 | 104KB | | 优化6 | 51毫秒 | - | | 优化7 | 31毫秒 | - | | 优化8 | 30毫秒 | 228KB |
优化8最快(~30毫秒)但消耗~228KB内存;优化5("单查询")耗时~38毫秒(慢9毫秒)但仅分配~104KB,内存消耗不到优化8的一半。
实践中建议从优化5开始,它在速度、内存和复杂度间取得了最佳平衡。随着数据增长,可考虑切换到多阶段方案(优化7/8)。
本次EF Core查询优化挑战中,我们将查询从30秒优化到约30毫秒。关键收获:
专业建议:考虑为查询中频繁访问的字段添加数据库索引以进一步提升性能。
记住黄金法则:先测量,再优化。通常你会发现查询已经足够快,无需优化。使用日志或性能分析器识别实际瓶颈,应用本文的针对性改进,并验证每次优化的效果。