Building an Efficient Logging System with Ease - 51CTO.COM

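As developers, we often need to check logs while debugging. Without logs, or without knowing how to analyze a problem through them, we cannot locate the code that failed.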

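For a system that serves thousands or even millions of users every day, logs are indispensable because:

Logs help us find the errors that affect end users.
Logs let us track the system's "health" and notice "warning signs" before something actually breaks.
...and so on.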

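Logs are therefore essential when developing or operating a system, and a well-designed, well-implemented logging system makes monitoring simpler.

In this article I share my experience and understanding of designing and building a logging system. I hope that after reading it you will:

Understand why logging matters when operating a system.
Have a reference to draw on when implementing a logging system of your own.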

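I. Logging Strategy

Before implementing a logging system, we should ask ourselves the following questions.

Why: What is the purpose of logging?
Who: Which module will generate the logs?
When: When are logs output?
Where: Where are logs output (sent to Slack, BigQuery, and so on)?
What: What information do the logs provide?
How: How are logs output?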

II. Log Levels

Once the purpose of logging is clear, the logs should be divided into levels:

Log level	Concept	How to handle	Example
FATAL	Errors that prevent the system from operating	Must be fixed immediately	Cannot connect to the DB
ERROR	Unexpected errors	Should be fixed as soon as possible	Cannot send an email
WARN	Not an error, but a problem such as unexpected input or unexpected execution	Should be refactored regularly	An API that regularly deletes data
INFO	Notification when an execution or a transaction starts or ends; may also output other needed information	No fix needed	Output the body of the request or response
DEBUG	Information related to system status	Do not output in the production environment	Can be placed inside a function
TRACE	Information more detailed than DEBUG	Do not output in the production environment
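The ordering implied by the table can be sketched as a threshold check: a message is emitted only when its level is at least as severe as the configured threshold. This is a minimal illustration, not part of the sample project; the environment names and the `thresholdFor` helper are assumptions made for the example.

```typescript
// Levels ordered from most verbose to most severe, mirroring the table above.
const LEVELS = ['TRACE', 'DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'] as const;
type Level = (typeof LEVELS)[number];

// True when a message at `level` should be emitted under `threshold`.
function isEnabled(threshold: Level, level: Level): boolean {
  return LEVELS.indexOf(level) >= LEVELS.indexOf(threshold);
}

// Hypothetical environment names: production drops DEBUG and TRACE,
// as the "How to handle" column recommends.
function thresholdFor(env: 'production' | 'staging' | 'development'): Level {
  return env === 'production' ? 'INFO' : 'TRACE';
}
```

With this rule, a DEBUG message is dropped in production but still appears in staging, while ERROR and FATAL are emitted everywhere.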

III. Case Study

Once the log levels are defined, we must decide which types of logs to output.

This section answers the following six questions for each log type.

Why
Who
When
Where
What
How

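1. System Log

(1) Why: When an error occurs in the system, the system log is used for debugging.

(2) Who: The system itself outputs the log.

(3) When: The log is output when an error occurs.

(4) Where:

FATAL / ERROR: Notify developers so they can handle it immediately.
WARN / INFO: Output inside the system or to a log management tool.
DEBUG / TRACE: Output with console.log in the staging environment.

(5) What:

FATAL / ERROR: The stack trace.
WARN / INFO / DEBUG / TRACE: The content to be reported.

(6) How:

FATAL / ERROR: Output through a log management tool or through Slack, SMS, and so on (push mode).
WARN / INFO / DEBUG / TRACE: Output through a log management tool or inside the system (pull mode).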
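The Where / What / How answers above amount to a routing table keyed by level. A minimal sketch of that table follows; the sink names are hypothetical labels chosen for the example, not identifiers from the sample project.

```typescript
type SystemLevel = 'FATAL' | 'ERROR' | 'WARN' | 'INFO' | 'DEBUG' | 'TRACE';

interface SystemLogRoute {
  mode: 'push' | 'pull'; // push: someone is notified; pull: stored until queried
  sink: string; // hypothetical sink label
  payload: 'stack-trace' | 'message';
}

// One entry per level, following the Where / What / How lists above.
const SYSTEM_LOG_ROUTES: Record<SystemLevel, SystemLogRoute> = {
  FATAL: { mode: 'push', sink: 'developer-notification', payload: 'stack-trace' },
  ERROR: { mode: 'push', sink: 'developer-notification', payload: 'stack-trace' },
  WARN: { mode: 'pull', sink: 'log-management-tool', payload: 'message' },
  INFO: { mode: 'pull', sink: 'log-management-tool', payload: 'message' },
  DEBUG: { mode: 'pull', sink: 'staging-console', payload: 'message' },
  TRACE: { mode: 'pull', sink: 'staging-console', payload: 'message' },
};

function routeSystemLog(level: SystemLevel): SystemLogRoute {
  return SYSTEM_LOG_ROUTES[level];
}
```

Keeping the policy in one table makes it easy to review against the six questions and to change a destination without touching the call sites.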

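2. Access Log

Why: Output logs to trace how requests are sent and received.
Who: The system itself or the infrastructure.
When: Output when a request is sent or received.
Where: At the INFO level, in pull mode. Because the log volume can be very large, pay attention to how quickly logs can be searched.
What: Output who accessed the system, how, and when.
How: May vary depending on the purpose.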
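As an illustration of "who, how, and when", an access log line could put a few fixed-order fields up front for quick searching and the full entry after them as JSON. This is a sketch under assumed field names (`AccessLogEntry` and `formatAccessLog` are hypothetical), not the format used by the sample project.

```typescript
interface AccessLogEntry {
  requestId: number; // links a request line to its response line
  time: string; // when: ISO 8601 timestamp
  ip: string; // who
  method: string; // how
  url: string; // how
  userAgent?: string;
}

// Fixed-order prefix for quick text search, JSON tail for tooling.
function formatAccessLog(entry: AccessLogEntry): string {
  const { time, ip, method, url } = entry;
  return `${time} ${ip} ${method} ${url} ${JSON.stringify(entry)}`;
}
```

Given the volume concern noted above, a stable prefix keeps simple text searches fast, while the JSON part stays machine-readable.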

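3. Action Log

Why: Analyze user actions and improve the service based on them.
Who: The system itself or an external tool.
When: When certain actions occur.
Where: A log analysis tool (BigQuery and so on).
What: Depends on the purpose.
How: May vary depending on the purpose.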
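Analysis tools such as BigQuery can load newline-delimited JSON, so one plausible shape for action logs is one JSON object per line. The event fields below are assumptions made for the example; what to record depends on the analysis you have in mind.

```typescript
interface ActionEvent {
  event: string; // hypothetical event name, e.g. 'post_liked'
  userId: string;
  occurredAt: string; // ISO 8601 timestamp
  properties: Record<string, string | number | boolean>;
}

// Serializes events as newline-delimited JSON: one object per line.
function toNdjson(events: ActionEvent[]): string {
  return events.map((e) => JSON.stringify(e)).join('\n');
}
```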

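4. Auth Log

Why: Track the outcome of user authentication.
Who: The system itself.
When: When a user is authenticated.
Where: At the INFO level, in pull mode.
What: Output the time, user, and method of the authentication.
How: May vary depending on the authentication method.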
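An auth log entry needs the time, the user, and the method, and nothing secret. A minimal sketch, assuming hypothetical method names and a hypothetical `buildAuthLog` helper, copies only the fields that are safe to record:

```typescript
type AuthMethod = 'password' | 'oauth' | 'token'; // hypothetical method names

interface AuthLogEntry {
  time: string; // ISO 8601 timestamp
  userId: string;
  method: AuthMethod;
  success: boolean;
}

interface AuthAttempt {
  userId: string;
  method: AuthMethod;
  success: boolean;
  password?: string; // present on the attempt, never copied to the log
}

// Copies only the listed fields so credentials cannot reach the log.
function buildAuthLog(attempt: AuthAttempt, now: Date): AuthLogEntry {
  return {
    time: now.toISOString(),
    userId: attempt.userId,
    method: attempt.method,
    success: attempt.success,
  };
}
```

Building the entry field by field, rather than spreading the input, means a new sensitive field on the attempt is left out by default.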

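IV. Example

That covers the concepts. Let's look at a sample project.

For more details on the code, see the GitHub repository [2].

1. Choosing a Logging Library

I chose the log4js [3] library for a simple reason: the way log4js structures log levels matches how I think about them.

2. Implementation

Step 1 - Define the Logger Class

First, define the logger class: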

import * as log4js from 'log4js';

// loggerConfig is the log4js configuration object, defined elsewhere in the project.

class Logger {
  // Loggers by log type (log4js category)
  public default: log4js.Logger;
  public system: log4js.Logger;
  public api: log4js.Logger;
  public access_req: log4js.Logger;
  public access_res: log4js.Logger;
  public sql: log4js.Logger;
  public auth: log4js.Logger;

  // Loggers by log level
  public fatal: log4js.Logger;
  public error: log4js.Logger;
  public warn: log4js.Logger;
  public info: log4js.Logger;
  public debug: log4js.Logger;
  public trace: log4js.Logger;

  constructor() {
    log4js.configure(loggerConfig);

    // getLogger() without a category returns the "default" logger
    this.default = log4js.getLogger();
    this.system = log4js.getLogger('system');
    this.api = log4js.getLogger('api');
    this.access_req = log4js.getLogger('access_req');
    this.access_res = log4js.getLogger('access_res');
    this.sql = log4js.getLogger('sql');
    this.auth = log4js.getLogger('auth');

    this.fatal = log4js.getLogger('fatal');
    this.fatal.level = log4js.levels.FATAL;

    this.error = log4js.getLogger('error');
    this.error.level = log4js.levels.ERROR;

    this.warn = log4js.getLogger('warn');
    this.warn.level = log4js.levels.WARN;

    this.info = log4js.getLogger('info');
    this.info.level = log4js.levels.INFO;

    this.debug = log4js.getLogger('debug');
    this.debug.level = log4js.levels.DEBUG;

    this.trace = log4js.getLogger('trace');
    this.trace.level = log4js.levels.TRACE;
  }
}

// Shared instance used by the logging functions in Step 2.
export const logger = new Logger();

The Logger class defines the log levels:

fatal
error
warn
info
debug
trace

On top of these, I also defined the log types:

system
api
access_req
access_res
sql
auth
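The Logger class passes a loggerConfig object to log4js.configure, but the article does not show it. The sketch below is one plausible shape, not the project's actual configuration: it uses the log4js layout of `appenders` plus `categories`, registers the required `default` category, and adds one category per log type. The file path, the choice of appenders, and the helper names are assumptions; the types are declared locally so the block stands alone.

```typescript
interface SketchAppender {
  type: string;
  filename?: string;
}

interface SketchCategory {
  appenders: string[];
  level: string;
}

interface SketchConfig {
  appenders: Record<string, SketchAppender>;
  categories: Record<string, SketchCategory>;
}

// Category names taken from the getLogger(...) calls in the Logger class.
const LOG_TYPE_CATEGORIES = ['system', 'api', 'access_req', 'access_res', 'sql', 'auth'];

function buildLoggerConfig(): SketchConfig {
  const config: SketchConfig = {
    appenders: {
      console: { type: 'console' },
      // Hypothetical file path for access logs
      accessFile: { type: 'file', filename: 'logs/access.log' },
    },
    categories: {
      // log4js requires a category named "default"
      default: { appenders: ['console'], level: 'info' },
    },
  };

  LOG_TYPE_CATEGORIES.forEach((name) => {
    const isAccess = name === 'access_req' || name === 'access_res';
    config.categories[name] = {
      appenders: isAccess ? ['console', 'accessFile'] : ['console'],
      level: 'info',
    };
  });

  return config;
}

// Lists appenders that a category references but the config never defines.
function findMissingAppenders(config: SketchConfig): string[] {
  const missing: string[] = [];
  Object.keys(config.categories).forEach((name) => {
    const category = config.categories[name];
    if (!category) return;
    category.appenders.forEach((appender) => {
      const defined = Object.prototype.hasOwnProperty.call(config.appenders, appender);
      if (!defined && missing.indexOf(appender) === -1) missing.push(appender);
    });
  });
  return missing;
}
```

Categories that are not listed, such as the level-named loggers (fatal, error, and so on), fall back to the `default` category in log4js, which is why the class can set their levels after calling getLogger.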

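Step 2 - Apply the Logger to the Project

Apply the Logger class to a project built with the NestJS [4] framework.

The logger is hooked into the project through NestJS interceptors [5].

I chose an interceptor because a NestJS interceptor wraps both the request stream and the response stream going into and out of the API, which makes it the simplest way to capture request logs and response logs. This is how I defined the LoggerInterceptor class: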

import {
  CallHandler,
  ExecutionContext,
  HttpException,
  Injectable,
  NestInterceptor,
} from '@nestjs/common';
import { Observable } from 'rxjs';
import { tap } from 'rxjs/operators';

@Injectable()
export class LoggerInterceptor implements NestInterceptor {
  intercept(context: ExecutionContext, next: CallHandler<any>): Observable<any> {
    // intercept() wraps the request/response stream.

    // Get the request object from the context and log it.
    // request._id is assumed to be set earlier in the request pipeline.
    const request = context.switchToHttp().getRequest();
    requestLogger(request);

    // Get the response object from the context; it is passed to
    // responseLogger when the handler succeeds.
    const response = context.switchToHttp().getResponse();

    // tap() only observes the stream, so the handler's data and errors
    // pass through to the client unchanged.
    return next.handle().pipe(
      tap({
        // 2xx - success response
        next: (data) => {
          responseLogger({ requestId: request._id, response, data });
        },
        // 4xx, 5xx - error response
        error: (exception: HttpException | Error) => {
          try {
            responseErrorLogger({ requestId: request._id, exception });
          } catch (e) {
            logger.access_res.error(e);
          }
        },
      })
    );
  }
}
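The wrapping idea behind the interceptor can be shown without NestJS or RxJS. The sketch below is an illustration only: it uses a synchronous handler to stay short, and `withLogging` and `WrapLogSink` are hypothetical names. The point is that the wrapper logs on the way in and on the way out, returns the handler's result untouched, and rethrows any error after logging it.

```typescript
interface WrapLogSink {
  request(req: unknown): void;
  response(res: unknown): void;
  error(err: unknown): void;
}

// Wraps a handler so that the request, the result, and any error are logged.
function withLogging<Req, Res>(
  handler: (req: Req) => Res,
  sink: WrapLogSink
): (req: Req) => Res {
  return (req: Req): Res => {
    sink.request(req);
    try {
      const res = handler(req);
      sink.response(res);
      return res; // pass the result through unchanged
    } catch (err) {
      sink.error(err);
      throw err; // logging must not swallow the failure
    }
  };
}
```

A logging wrapper that forgets to return the result, or that catches the error without rethrowing it, changes what the caller sees; both are easy mistakes to make in this pattern.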

Three functions are defined:

requestLogger: logs request information.
responseLogger: logs response information.
responseErrorLogger: logs error information.

They look like this:

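// Import forms may need adjusting to the project's module settings.
import * as util from 'util';
import dayjs from 'dayjs';
import * as maskData from 'maskdata';
import { HttpException } from '@nestjs/common';
import { Request, Response } from 'express';

// FixType and ResponseLog are project types defined elsewhere in the repository.

const MaskField = {
  Email: 'email',
  Password: 'password',
} as const;

type MaskField = (typeof MaskField)[keyof typeof MaskField];

// Returns a masked copy, so the request body seen by the handler is left untouched.
const _maskFields = (object: FixType, fields: MaskField[]): FixType => {
  const maskOptions = {
    maskWith: '*',
    unmaskedStartCharacters: 0,
    unmaskedEndCharacters: 0,
  };
  const masked = { ...object };

  for (const field of fields) {
    if (masked[field] == null) continue; // nothing to mask

    switch (field) {
      case MaskField.Email:
        masked[field] = maskData.maskEmail2(masked[field], maskOptions);
        break;
      case MaskField.Password:
        masked[field] = maskData.maskPassword(masked[field], maskOptions);
        break;
    }
  }

  return masked;
};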

export const requestLogger = (request: Request) => {
  const { ip, originalUrl, method, params, query, body, headers } = request;

  // logTemplate fields: now (time), ip, http_method, url, request_object
  const logTemplate = '%s %s %s %s %s';
  const now = dayjs().format('YYYY-MM-DD HH:mm:ss.SSS');

  const logContent = util.formatWithOptions(
    { colors: true },
    logTemplate,
    now,
    ip,
    method,
    originalUrl,
    JSON.stringify({
      method,
      url: originalUrl,
      userAgent: headers['user-agent'],
      body: _maskFields(body, [MaskField.Email, MaskField.Password]),
      params,
      query,
    })
  );

  // Use the access_req logger defined earlier.
  logger.access_req.info(logContent);
};

// Output the success response log
export const responseLogger = (input: {
  requestId: number;
  response: Response;
  data: any;
}) => {
  const { requestId, response, data } = input;

  const log: ResponseLog = {
    requestId,
    statusCode: response.statusCode,
    data,
  };

  // Use the access_res logger defined earlier.
  logger.access_res.info(JSON.stringify(log));
};

// Output the error response log
export const responseErrorLogger = (input: {
  requestId: number;
  exception: HttpException | Error;
}) => {
  const { requestId, exception } = input;

  const log: ResponseLog = {
    requestId,
    statusCode:
      exception instanceof HttpException ? exception.getStatus() : null,
    message: exception?.stack || exception?.message,
  };

  // Use the access_res logger defined earlier.
  logger.access_res.info(JSON.stringify(log));
  logger.access_res.error(exception);
};
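Masking before logging is worth a closer look, because the object being logged is usually the same object the request handler will read. The sketch below is a library-free illustration of the idea (the sample project uses a masking library instead); `maskForLog` and `SENSITIVE_KEYS` are hypothetical names. It masks a shallow copy and replaces each sensitive value with a fixed-length placeholder so the log does not reveal the value's length.

```typescript
// Keys treated as sensitive in this sketch.
const SENSITIVE_KEYS = ['email', 'password'] as const;

// Returns a shallow copy of `body` with sensitive string values replaced.
function maskForLog(
  body: Record<string, unknown>,
  keys: readonly string[] = SENSITIVE_KEYS
): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...body };
  keys.forEach((key) => {
    if (typeof copy[key] === 'string') {
      copy[key] = '***'; // fixed placeholder, independent of the value's length
    }
  });
  return copy;
}
```

Because only the copy is changed, the handler still receives the real email and password while the log receives the placeholders.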

With LoggerInterceptor defined, apply the interceptor to the application:

const app = await NestFactory.create(AppModule);

app.useGlobalInterceptors(new LoggerInterceptor());

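Applying a custom interceptor in a NestJS application is not difficult, because it is a built-in NestJS feature.

I use the fatal and debug loggers in the use case layer or the infrastructure layer, for the following purposes:

Reporting fatal errors such as being unable to connect to the database.
Debugging when a user runs into a problem.

It only takes this: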

// The fatal logger's level is FATAL, so only .fatal() calls are emitted.
logger.fatal.fatal('Error message');

The fatal log can be output to the console or to a notification channel such as Slack.
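For the Slack case, one possible shape is to build the message separately from the code that sends it. Slack incoming webhooks accept a JSON body with a `text` field; everything else here (`FatalNotice`, `buildSlackPayload`, `notifyFatal`, and the injected `send` function) is an assumption made for the sketch. The actual HTTP call is left to the caller, which keeps the sketch free of network code.

```typescript
interface FatalNotice {
  service: string; // hypothetical service name
  message: string;
  stack?: string;
}

interface SlackPayload {
  text: string;
}

// Builds the JSON body for a Slack incoming webhook.
function buildSlackPayload(notice: FatalNotice): SlackPayload {
  const lines = [`[FATAL] ${notice.service}: ${notice.message}`];
  if (notice.stack) {
    lines.push(notice.stack);
  }
  return { text: lines.join('\n') };
}

// `send` performs the delivery, e.g. an HTTP POST to the webhook URL.
function notifyFatal(notice: FatalNotice, send: (payload: SlackPayload) => void): void {
  send(buildSlackPayload(notice));
}
```

Injecting `send` also makes the notification path easy to exercise in tests with a fake sender.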

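The results look like this:

First, the access request log and response log (when no error occurs).

Request-related information such as the method and body is clearly shown.

When an error occurs:

Both the error type and the error message are shown.

The fatal log looks like this:

It also outputs the error message and the error type.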

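V. Conclusion

This article showed how to design and implement a basic logging system.

I hope the simple example helps you see why building a logging system is important and necessary, and how it supports operating and debugging a system.

References: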

[1] Design And Building A Logging System: https://levelup.gitconnected.com/design-and-building-a-logging-system-fd5dcad110ed
[2] NewAnigram-BE-DDD: https://github.com/tuananhhedspibk/NewAnigram-BE-DDD
[3] log4js: https://github.com/log4js-node/log4js-node
[4] NestJS: https://docs.nestjs.com
[5] NestJS Interceptor: https://docs.nestjs.com/interceptors