C++中glog源碼剖析以及如何設計一個高效 log模塊
掃描二維碼
隨時隨地手機看文章
每個開發(fā)者編程中都會記錄log信息,多數(shù)人都會使用log第三方庫,log庫使用起來很方便,但我們也需要了解log系統(tǒng)的原理,這里以glog為例進行分析。
開始
這里不會介紹glog中是如何控制INFO、ERROR等級別的輸出的,其實就是一個宏控制,主要介紹google glog中一次LOG(INFO)過程中究竟發(fā)生了什么,以及為什么glog是線程安全的。
glog中的LOG(INFO)其實是一個宏定義,如下:
#define LOG(severity) COMPACT_GOOGLE_LOG_ ## severity.stream() 那么LOG(INFO)相當于 COMPACT_GOOGLE_LOG_INFO.stream()
而COMPACT_GOOGLE_LOG_INFO也是一個宏,如下:
#define COMPACT_GOOGLE_LOG_INFO google::LogMessage(__FILE__, __LINE__) 那么LOG(INFO)相當于 COMPACT_GOOGLE_LOG_INFO.stream()也就相當于 google::LogMessage(__FILE__, __LINE__).stream()
這里就構造了一個google::LogMessage的臨時對象,語句執(zhí)行完就會自動析構,google::LogMessage的大體結構如下:
// 這里是摘下來的極簡版,實際比這復雜很多
class LogMessage {
LogMessageData::LogStream stream_; // 自定義的輸出流,繼承的std::ostream
LogMessage(const char* file, int line);
~LogMessage(); // 在析構函數(shù)中有大量邏輯,后面講解
std::ostream& stream(); // 返回stream_
void SendToLog(); // 主要函數(shù)
void Init(const char* file, int line, LogSeverity severity,
void (LogMessage::*send_method)());
void Flush();
LogMessageData* allocated_; // LogMessageData,主要的結構體
LogMessageData* data_;
};
LogMessage的構造函數(shù)如下:
LogMessage::LogMessage(const char* file, int line)
: allocated_(NULL) {
Init(file, line, GLOG_INFO, &LogMessage::SendToLog);
}
Init函數(shù)精簡版如下:
void LogMessage::Init(const char* file,
int line,
LogSeverity severity,
void (LogMessage::*send_method)()) {
allocated_ = NULL;
// 主要看這里,glog比較高效的地方就在于它沒有頻繁的申請內存,而是使用線程thread_local特性,為每個線程創(chuàng)建一塊私有內存,同一線程頻繁的使用這塊內存構造LogMessageData的對象,因為這個LogMessageData對象是臨時對象,每次都會立刻析構,這樣下一個對象可以重復在這塊內存地址來構造對象
if (thread_data_available) {
thread_data_available = false;
// 這里會將LogMessageData對象構造在內存對齊的地址上,HAVE_ALIGNED_STORAGE這個宏在C++11后會有效,C++11前就會走到這個#else分支,需要自己進行手動內存對齊,為什么HAVE_ALIGNED_STORAGE宏下面的代碼可以直接構造對象,下面會介紹
#ifdef HAVE_ALIGNED_STORAGE
data_ = new (&thread_msg_data) LogMessageData;
#else
const uintptr_t kAlign = sizeof(void*) - 1;
char* align_ptr =
reinterpret_cast(reinterpret_cast(thread_msg_data + kAlign) & ~kAlign);
data_ = new (align_ptr) LogMessageData;
assert(reinterpret_cast(align_ptr) % sizeof(void*) == 0);
#endif
} else {
allocated_ = new LogMessageData();
data_ = allocated_;
}
// 下面這一堆代碼就是在每一行l(wèi)og具體信息前加上一些信息前綴,時間戳線程號文件名行數(shù)等等
stream().fill('0');
data_->preserved_errno_ = errno;
data_->severity_ = severity;
data_->line_ = line;
data_->send_method_ = send_method;
data_->sink_ = NULL;
data_->outvec_ = NULL;
WallTime now = WallTime_Now();
data_->timestamp_ = static_cast(now);
localtime_r(&data_->timestamp_, &data_->tm_time_);
int usecs = static_cast((now - data_->timestamp_) * 1000000);
data_->num_chars_to_log_ = 0;
data_->num_chars_to_syslog_ = 0;
data_->basename_ = const_basename(file);
data_->fullname_ = file;
data_->has_been_flushed_ = false;
// If specified, prepend a prefix to each line. For example:
// I1018 160715 f5d4fbb0 logging.cc:1153]
// (log level, GMT month, date, time, thread_id, file basename, line)
// We exclude the thread_id for the default thread.
if (FLAGS_log_prefix && (line != kNoLogPrefix)) {
stream() << LogSeverityNames[severity][0]
<< setw(2) << 1+data_->tm_time_.tm_mon
<< setw(2) << data_->tm_time_.tm_mday
<< ' '
<< setw(2) << data_->tm_time_.tm_hour << ':'
<< setw(2) << data_->tm_time_.tm_min << ':'
<< setw(2) << data_->tm_time_.tm_sec << "."
<< setw(6) << usecs
<< ' '
<< setfill(' ') << setw(5)
<< static_cast(GetTID()) << setfill('0')
<< ' '
<< data_->basename_ << ':' << data_->line_ << "] ";
}
data_->num_prefix_chars_ = data_->stream_.pcount();
}
為什么HAVE_ALIGNED_STORAGE宏下面的代碼可以直接構造對象呢?
看這里
#ifdef HAVE_ALIGNED_STORAGE static GLOG_THREAD_LOCAL_STORAGE std::aligned_storage是因為C++11后可以直接使用std::aligned_storage來申請對齊的內存地址。
LOG(INFO)<<”xxx”<<”yyy”中,glog是如何收集到xxx和yyy數(shù)據的呢?
我們上面看到LogMessage的構造函數(shù)中會構造核心的LogMessageData對象,看如下LogMessageData的結構體和構造函數(shù)中的注釋
struct LogMessage::LogMessageData { LogMessageData(); int preserved_errno_; // preserved errno // Buffer space; contains complete message text. char message_text_[LogMessage::kMaxLogMessageLen+1]; // 這里存放log的具體信息 LogStream stream_; // std::ostream char severity_; // What level is this LogMessage logged at? int line_; // line number where logging call is. void (LogMessage::*send_method_)(); // Call this in destructor to send union { // At most one of these is used: union to keep the size low. LogSink* sink_; // NULL or sink to send message to std::vector* outvec_; // NULL or vector to push message onto std::string* message_; // NULL or string to write message into }; time_t timestamp_; // Time of creation of LogMessage struct ::tm tm_time_; // Time of creation of LogMessage size_t num_prefix_chars_; // # of chars of prefix in this message size_t num_chars_to_log_; // # of chars of msg to send to log size_t num_chars_to_syslog_; // # of chars of msg to send to syslog const char* basename_; // basename of file that called LOG const char* fullname_; // fullname of file that called LOG bool has_been_flushed_; // false => data has not been flushed bool first_fatal_; // true => this was first fatal msg private: LogMessageData(const LogMessageData&); void operator=(const LogMessageData&); }; LogMessage::LogMessageData::LogMessageData() : stream_(message_text_, LogMessage::kMaxLogMessageLen, 0) { } // 這里就不列出LogStream的構造函數(shù)啦,LogStream內部會使用到streambuf,而這個streambuf的地址就是上面LogMessageData構造函數(shù)中傳入的message_text_地址,這樣stream_接收到的所有的數(shù)據都會存到這個message_text_中.將stream接收到的數(shù)據都存到了messagetext中,那數(shù)據是如何寫到文件中去的并且是線程安全的呢?
這些邏輯都是在LogMessage的析構函數(shù)中,看精簡代碼:
LogMessage::~LogMessage() { Flush();// 主要功能在Flush函數(shù)中 if (data_ == static_cast(&thread_msg_data)) { data_->~LogMessageData(); thread_data_available = true; } else { delete allocated_; } } Flush函數(shù)精簡:
void LogMessage::Flush() { { MutexLock l(&log_mutex);// 注意,這里使用了全局log鎖,所以寫操作等等都是線程安全的,即沒有出現(xiàn)日志錯亂現(xiàn)象 (this->*(data_->send_method_))(); // 這里的send_method_可以從上面Init函數(shù)中看見其實就是使用SendToLog() ++num_messages_[static_cast(data_->severity_)]; } }SendToLog函數(shù)精簡:
void LogMessage::SendToLog() { // LogDestination類中有好多靜態(tài)函數(shù),LogToAllLogfiles和MaybeLogToStderr是兩個常用的函數(shù),一個寫到文件中,一個寫到輸出流中, data->severity_這里指的是log級別,即INFO、WARNING或者ERROR等 LogDestination::LogToAllLogfiles(data_->severity_, data_->timestamp_, data_->message_text_, data_->num_chars_to_log_); LogDestination::MaybeLogToStderr(data_->severity_, data_->message_text_, data_->num_chars_to_log_); }LogDestination::LogToAllLogfiles函數(shù)精簡:
inline void LogDestination::LogToAllLogfiles(LogSeverity severity, time_t timestamp, const char* message, size_t len) { if ( FLAGS_logtostderr ) { // global flag: never log to file ColoredWriteToStderr(severity, message, len); } else { // 看這里,glog會把高級別的log也會寫在低級別的log文件中,即ERROR的log也會出現(xiàn)在INFO的log文件中 for (int i = severity; i >= 0; --i) LogDestination::MaybeLogToLogfile(i, timestamp, message, len); } }LogDestination::MaybeLogToLogfile函數(shù)精簡:
inline void LogDestination::MaybeLogToLogfile(LogSeverity severity, time_t timestamp, const char* message, size_t len) { const bool should_flush = severity > FLAGS_logbuflevel; // 這里每個log level都會有一個LogDestination的對象指針 LogDestination* destination = log_destination(severity); // Write就是寫文件操作 destination->logger_->Write(should_flush, timestamp, message, len); }glog中日志寫到文件中是什么時候Flush到磁盤中的呢?
看下destination->logger_->Write函數(shù)精簡實現(xiàn):
void LogFileObject::Write(bool force_flush, time_t timestamp, const char* message, int message_len) { MutexLock l(&lock_); // We don't log if the base_name_ is "" (which means "don't write") if (base_filename_selected_ && base_filename_.empty()) { return; } // If there's no destination file, make one before outputting // 沒有創(chuàng)建文件那就創(chuàng)建一個文件,文件名中包含時間進程號等信息 if (file_ == NULL) { // Try to rollover the log file every 32 log messages. The only time // this could matter would be when we have trouble creating the log // file. If that happens, we'll lose lots of log messages, of course! if (++rollover_attempt_ != kRolloverAttemptFrequency) return; rollover_attempt_ = 0; struct ::tm tm_time; localtime_r(×tamp, &tm_time); // The logfile's filename will have the date/time & pid in it ostringstream time_pid_stream; time_pid_stream.fill('0'); time_pid_stream << 1900+tm_time.tm_year << setw(2) << 1+tm_time.tm_mon << setw(2) << tm_time.tm_mday << '-' << setw(2) << tm_time.tm_hour << setw(2) << tm_time.tm_min << setw(2) << tm_time.tm_sec << '.' << GetMainThreadPid(); const string& time_pid_string = time_pid_stream.str(); if (!CreateLogfile(time_pid_string)) { perror("Could not create log file"); fprintf(stderr, "COULD NOT CREATE LOGFILE '%s'!\n", time_pid_string.c_str()); return; } // 在文件的最開頭寫入log創(chuàng)建的相應信息 // Write a header message into the log file ostringstream file_header_stream; file_header_stream.fill('0'); file_header_stream << "Log file created at: " << 1900+tm_time.tm_year << '/' << setw(2) << 1+tm_time.tm_mon << '/' << setw(2) << tm_time.tm_mday << ' ' << setw(2) << tm_time.tm_hour << ':' << setw(2) << tm_time.tm_min << ':' << setw(2) << tm_time.tm_sec << '\n' << "Running on machine: " << LogDestination::hostname() << '\n' << "Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu " << "threadid file:line] msg" << '\n'; const string& file_header_string = file_header_stream.str(); const int header_len = file_header_string.size(); fwrite(file_header_string.data(), 1, header_len, file_); file_length_ += header_len; bytes_since_flush_ += header_len; } // Write to LOG file if ( !stop_writing ) { // fwrite() doesn't return an error when the disk is full, for // messages that are less than 4096 bytes. When the disk is full, // it returns the message length for messages that are less than // 4096 bytes. fwrite() returns 4096 for message lengths that are // greater than 4096, thereby indicating an error. errno = 0; // 開始把數(shù)據寫到log文件中 fwrite(message, 1, message_len, file_); if ( FLAGS_stop_logging_if_full_disk && errno == ENOSPC ) { // disk full, stop writing to disk stop_writing = true; // until the disk is return; } else { file_length_ += message_len; bytes_since_flush_ += message_len; } } else { if ( CycleClock_Now() >= next_flush_time_ ) stop_writing = false; // check to see if disk has free space. return; // no need to flush } // See important msgs *now*. Also, flush logs at least every 10^6 chars, // or every "FLAGS_logbufsecs" seconds. // 控制什么時候把文件內容Flush到磁盤中 if ( force_flush || (bytes_since_flush_ >= 1000000) || (CycleClock_Now() >= next_flush_time_) ) { FlushUnlocked(); } } // 這里是FlushUnlocked函數(shù) void LogFileObject::FlushUnlocked(){ if (file_ != NULL) { fflush(file_); bytes_since_flush_ = 0; } // Figure out when we are due for another flush. const int64 next = (FLAGS_logbufsecs * static_cast(1000000)); // in usec next_flush_time_ = CycleClock_Now() + UsecToCycles(next); }從這里可以看到glog是根據內容長度和時間來控制是否Flush到磁盤中,有1000000字節(jié)沒有被Flush就執(zhí)行一次Flush操作,或者FLAGS_logbufsecs秒沒有Flush也會執(zhí)行一次Flush,這個FLAGS_logbufsecs默認是30,可以設置。
總結
glog通過重寫std::ostream、std::stream_buf以及thread_local等技術實現(xiàn)了精簡高效的日志輸出功能,每行l(wèi)og語句都會創(chuàng)建一個LogMessage對象,通過LogMessage對象內部的stream收集需要輸出的消息存到對象內部的棧內存message_text中,一行語句結束,LogMessage對象析構,析構時會把message_text中的數(shù)據寫到文件和控制臺里,寫文件操作會每隔1000000字節(jié)或者每隔FLAGS_logbufsecs秒Flush到磁盤中。
如何設計一個高效的log模塊?
從上述源碼分析可知,glog是每次log都會執(zhí)行寫操作,并且寫操作是等鎖的,寫文件本身就比較耗時,再加上等鎖的時間,會阻塞當前寫log的業(yè)務工作線程,所以glog在多線程中會導致應用程序性能不是特別好,所以如果能夠減少阻塞工作線程的時間就可以設計出一個高效的log模塊,將日志的寫文件操作放在單獨的線程中,參考陳碩muduo代碼,log架構其實有很大改良空間。
muduo async log日志邏輯
muduo的異步日志是將寫日志的操作放在單獨的日志線程中,這里分為多個應用線程和專用的日志線程,同時有多塊緩存,大概可以分為兩大塊緩存池,有收集日志的緩存池和專用于寫日志的緩存池,收集日志的緩存池(buffer_vector)中有兩塊buffer,稱為current_buffer和next_buffer,多個應用線程的日志都會寫到current_buffer(buffer_mutex)中,當current_buffer滿的時候,將current_buffer的指針存到buffer_vector中,同時current_buffer的指針指向next_buffer,這樣應用線程可以繼續(xù)寫日志到current_buffer中,current_buffer的指針存到buffer_vector后,會通知到日志線程,這里加上鎖來控制current_buffer(buffer_mutex),寫日志的緩存池叫write_buffer_vector,里面也有兩塊緩存newBuffer1和newBuffer2,這時再將current_buffer的指針存入buffer_vector中,這時buffer_vector中有兩塊緩存的指針,之后將buffer_vector和write_buffer_vector交換,buffer_vector就是空,同時current_buffer指針指向newBuffer1,next_buffer指針指向newBuffer2,釋放鎖(buffer_mutex),這時log線程可以進行寫操作,write_buffer_vector的大小為2,將里面的兩塊內存都寫到文件中,同時newBuffer1和newBuffer2指針分別指向這兩塊內存,這樣下次再執(zhí)行交換操作時候write_buffer_vector和newBuffer1和newBuffer2都是空,一直循環(huán)執(zhí)行這類操作,log一般都是寫文件時候時間比較長,將數(shù)據memcpy到buffer中耗時較少,這樣可以大幅減少等鎖的時間,提升log的性能。
代碼如下:void AsyncLogging::append(const char* logline, int len) { muduo::MutexLockGuard lock(mutex_); if (currentBuffer_->avail() > len) { currentBuffer_->append(logline, len); } else { buffers_.push_back(std::move(currentBuffer_)); if (nextBuffer_) { currentBuffer_ = std::move(nextBuffer_); } else { currentBuffer_.reset(new Buffer); // Rarely happens } currentBuffer_->append(logline, len); cond_.notify(); } } void AsyncLogging::threadFunc() { assert(running_ == true); latch_.countDown(); LogFile output(basename_, rollSize_, false); BufferPtr newBuffer1(new Buffer); BufferPtr newBuffer2(new Buffer); newBuffer1->bzero(); newBuffer2->bzero(); BufferVector buffersToWrite; buffersToWrite.reserve(16); while (running_) { assert(newBuffer1 && newBuffer1->length() == 0); assert(newBuffer2 && newBuffer2->length() == 0); assert(buffersToWrite.empty()); { muduo::MutexLockGuard lock(mutex_); if (buffers_.empty()) // unusual usage! { cond_.waitForSeconds(flushInterval_); } buffers_.push_back(std::move(currentBuffer_)); currentBuffer_ = std::move(newBuffer1); buffersToWrite.swap(buffers_); if (!nextBuffer_) { nextBuffer_ = std::move(newBuffer2); } } assert(!buffersToWrite.empty()); if (buffersToWrite.size() > 2) { char buf[256]; snprintf(buf, sizeof buf, "Dropped log messages at %s, %zd larger buffers\n", Timestamp::now().toFormattedString().c_str(), buffersToWrite.size()-2); fputs(buf, stderr); output.append(buf, static_cast(strlen(buf))); buffersToWrite.erase(buffersToWrite.begin()+2, buffersToWrite.end()); } for (const auto& buffer : buffersToWrite) { // FIXME: use unbuffered stdio FILE ? or use ::writev ? output.append(buffer->data(), buffer->length()); } if (buffersToWrite.size() > 2) { // drop non-bzero-ed buffers, avoid trashing buffersToWrite.resize(2); } if (!newBuffer1) { assert(!buffersToWrite.empty()); newBuffer1 = std::move(buffersToWrite.back()); buffersToWrite.pop_back(); newBuffer1->reset(); } if (!newBuffer2) { assert(!buffersToWrite.empty()); newBuffer2 = std::move(buffersToWrite.back()); buffersToWrite.pop_back(); newBuffer2->reset(); } buffersToWrite.clear(); output.flush(); } output.flush(); }擴展
C++當前有兩個比較好用的log第三方庫easylogging++和spdlog,具體使用讀者可以網上找相關資料,個人推薦spdlog,后續(xù)會對spdlog進行原理分析。





