
goaccess: GoAccess is a real-time web log analyzer and interactive viewer that r ...


Open source software name:

goaccess

Open source software URL:

https://gitee.com/songboy/goaccess

Open source software description:

GoAccess

What is it?

GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser. It provides fast and valuable HTTP statistics for system administrators who require a visual server report on the fly. More info at: http://goaccess.io.

[Images: GoAccess Terminal Dashboard, GoAccess HTML Dashboard]

Features

GoAccess parses the specified web log file and outputs the data to the X terminal. Features include:

  • Completely Real Time: All panels and metrics are timed to be updated every 200 ms on the terminal output and every second on the HTML output.

  • No configuration needed: You can just run it against your access log file, pick the log format, and let GoAccess parse the access log and show you the stats.

  • Track Application Response Time: Track the time taken to serve the request. Extremely useful if you want to track pages that are slowing down your site.

  • Nearly All Web Log Formats: GoAccess allows any custom log format string. Predefined options include Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, etc.

  • Incremental Log Processing: Need data persistence? GoAccess has the ability to process logs incrementally through the on-disk B+Tree database.

  • Only one dependency: GoAccess is written in C. To run it, you only need ncurses as a dependency. That's it. It even features its own WebSocket server - http://gwsocket.io/.

  • Visitors: Determine the number of hits, visitors, bandwidth, and metrics for the slowest-running requests by the hour or date.

  • Metrics per Virtual Host: Have multiple Virtual Hosts (Server Blocks)? A dedicated panel displays which virtual host is consuming most of the web server resources.

  • Color Scheme Customizable: Tailor GoAccess to suit your own color taste/schemes, either through the terminal or by simply applying the stylesheet on the HTML output.

  • Support for large datasets: GoAccess features on-disk B+Tree storage for large datasets where it is not possible to fit everything in memory.

Nearly all web log formats...

GoAccess allows any custom log format string. Predefined options include, but are not limited to:

  • Amazon CloudFront (Download Distribution)
  • Amazon Simple Storage Service (S3)
  • AWS Elastic Load Balancing
  • Combined Log Format (XLF/ELF) Apache | Nginx
  • Common Log Format (CLF) Apache
  • Google Cloud Storage
  • Apache virtual hosts
  • Squid Native Format
  • W3C format (IIS)

Why GoAccess?

GoAccess was designed to be a fast, terminal-based log analyzer. Its core idea is to quickly analyze and view web server statistics in real time without needing to use your browser (great if you want to do a quick analysis of your access log via SSH, or if you simply love working in the terminal).

While the terminal output is the default output, it can also generate a complete real-time HTML report, as well as JSON and CSV reports.

You can think of it as more of a monitoring command-line tool than anything else.

Installation

GoAccess can be compiled and used on *nix systems.

Download, extract and compile GoAccess with:

$ wget http://tar.goaccess.io/goaccess-1.0.2.tar.gz
$ tar -xzvf goaccess-1.0.2.tar.gz
$ cd goaccess-1.0.2/
$ ./configure --enable-geoip --enable-utf8
$ make
# make install

Build from GitHub (Development)

$ git clone https://github.com/allinurl/goaccess.git
$ cd goaccess
$ autoreconf -fiv
$ ./configure --enable-geoip --enable-utf8
$ make
# make install

Distributions

It is easiest to install GoAccess on Linux using the preferred package manager of your Linux distribution.

Please note that not all distributions will have the latest version of GoAccess available.

Debian/Ubuntu

# apt-get install goaccess

NOTE: This is likely to install an outdated version of GoAccess. To make sure that you're running the latest stable version of GoAccess, see the alternative option below.

Official GoAccess Debian & Ubuntu repository

$ echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
$ wget -O - http://deb.goaccess.io/gnugpg.key | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install goaccess

Note:

  • For on-disk support (Trusty+ or Wheezy+), run: sudo apt-get install goaccess-tcb
  • .deb packages in the official repo are available through https as well. You may need to install apt-transport-https.

Fedora

# yum install goaccess

Arch Linux

# pacman -S goaccess

Gentoo

# emerge net-analyzer/goaccess

OS X / Homebrew

$ brew install goaccess

FreeBSD

# cd /usr/ports/sysutils/goaccess/ && make install clean
# pkg install sysutils/goaccess

OpenBSD

# cd /usr/ports/www/goaccess && make install clean
# pkg_add goaccess

OpenIndiana

# pkg install goaccess

pkgsrc (NetBSD, Solaris, SmartOS, ...)

# pkgin install goaccess

Windows

GoAccess can be used in Windows through Cygwin.

Storage

There are three storage options that can be used with GoAccess. Which one to choose depends on your environment and needs.

Default Hash Tables

In-memory storage provides better performance at the cost of limiting the dataset size to the amount of available physical memory. By default, GoAccess uses in-memory hash tables. If your dataset can fit in memory, then this will perform fine. It has very good memory usage and pretty good performance.

Tokyo Cabinet On-Disk B+ Tree

Use this storage method for large datasets where it is not possible to fit everything in memory. The B+ tree database is slower than any of the hash databases since data has to be committed to disk. However, using an SSD greatly increases the performance. You may also use this storage method if you need data persistence to quickly load statistics at a later date.

Tokyo Cabinet On-Memory Hash Database

An alternative to the default hash tables. It uses generic typing, and thus its performance in terms of memory and speed is average.

Command Line / Config Options

The following options can be supplied to the command or specified in the configuration file. If specified in the configuration file, long options need to be used without prepending --.

Command Line Option                 Description
-a --agent-list                     Enable a list of user-agents by host.
-c --config-dialog                  Prompt log/date configuration window.
-d --with-output-resolver           Enable IP resolver on HTML|JSON output.
-e --exclude-ip=<IP>                Exclude one or multiple IPv4/v6 including IP ranges.
-f --log-file=<filename>            Path to input log file.
-g --std-geoip                      Standard GeoIP database for less memory usage.
-h --help                           This help.
-H --http-protocol=<yes|no>
-i --hl-header                      Color highlight active panel.
-M --http-method=<yes|no>
-m --with-mouse                     Enable mouse support on main dashboard.
-o --output=<file.[htmlcsv
-p --config-file=<filename>         Custom configuration file.
-q --no-query-string                Remove request's query string. Can reduce memory usage.
-r --no-term-resolver               Disable IP resolver on terminal output.
-s --storage                        Display current storage method, e.g., B+ Tree, Hash.
-V --version                        Display version information and exit.
--444-as-404                        Treat non-standard status code 444 as 404.
--4xx-to-unique-count               Add 4xx client errors to the unique visitors count.
--addr=<addr>                       Specify IP address to bind server to.
--all-static-files                  Include static files that contain a query string.
--cache-lcnum=<number>              Max number of leaf nodes to be cached. [1024]
--cache-ncnum=<number>              Max number of non-leaf nodes to be cached. [512]
--color=<fg:bg[attrs, PANEL]>       Specify custom colors.
--color-scheme=<12
--compression=<zlib,bz2>            Each page is compressed with ZLIB
--date-format=<dateformat>          Specify log date format.
--date-spec=<date|hr>
--db-path=<path>                    Path of the database file. [/tmp/]
--dcf                               Display the path of the default config file.
--debug-file=<path>                 Send all debug messages to the specified file.
--double-decode                     Decode double-encoded values.
--enable-panel=<PANEL>              Enable parsing and displaying the given panel.
--geoip-city-data=<path>            Same as using --geoip-database.
--geoip-database=<path>             Path to GeoIP database v4/v6, e.g., GeoLiteCity.dat
--hour-spec=<hr|min>
--html-custom-css=<path.css>        Specify a custom CSS file in the HTML report.
--html-custom-js=<path.js>          Specify a custom JS file in the HTML report.
--html-report-title                 Set HTML report page title and header.
--ignore-crawlers                   Ignore crawlers.
--ignore-panel=<PANEL>              Ignore parsing and displaying the given panel.
--ignore-referer=<referer>          Ignore referers from being counted. Wildcards allowed.
--ignore-status=<CODE>              Ignore parsing the given status code(s).
--invalid-requests=<filename>       Log invalid requests to the specified file.
--json-pretty-print                 Format JSON output using tabs and newlines.
--keep-db-files                     Persist parsed data to disk.
--load-from-disk                    Load previously stored data from disk.
--log-format="<logformat>"          Specify log format. Inner quotes need to be escaped.
--max-items                         Maximum number of items to show per panel.
--no-color                          Disable colored output.
--no-column-names                   Don't write column names in terminal output.
--no-csv-summary                    Disable summary metrics on the CSV output.
--no-global-config                  Do not load the global configuration file.
--no-progress                       Disable progress metrics.
--no-tab-scroll                     Disable scrolling through panels on TAB.
--no-html-last-updated              Do not show the last updated field in the HTML report.
--num-tests=<number>                Number of lines to test against the given log format.
--origin=<url>                      Ensure clients send stated origin header on WS handshake.
--port=<port>                       Specify the port to use.
--real-os                           Display real OS names, e.g., Windows XP, Snow Leopard.
--real-time-html                    Enable real-time HTML output.
--sort-panel=PANEL,METRIC,ORDER     Sort panel on initial load. See manpage for metrics.
--static-file=<extension>           Add static file extension, e.g., .mp3. Case sensitive.
--time-format=<timeformat>          Specify log time format.
--tune-bnum=<number>                Number of elements of the bucket array. [32749]
--tune-lmemb=<number>               Number of members in each leaf page. [128]
--tune-nmemb=<number>               Number of members in each non-leaf page. [256]
--ws-url=<[scheme://]url[:port]>    URL to which the WebSocket server responds.
--xmmap=<number>                    Set the size in bytes of the extra mapped memory. [0]
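
As a rough illustration of the configuration-file form, the sketch below writes a small config to a temporary file, using long option names without the leading --. The option names are taken from the list above; the format strings assume your logs are in the Apache/Nginx combined log format, and the temporary file location is arbitrary.

```shell
#!/bin/sh
# Sketch: write a minimal GoAccess configuration file.
# Assumption: the logs use the Apache/Nginx combined log format.
conf_file="$(mktemp)"

# Long options are written without the leading "--".
cat > "$conf_file" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF

# Keep a copy of the contents and show them.
conf_text="$(cat "$conf_file")"
printf '%s\n' "$conf_text"
```

The resulting file could then be passed with -p, for example goaccess -f access.log -p "$conf_file".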

Usage

Different Outputs

To output to a terminal and generate an interactive report:

# goaccess -f access.log

To generate an HTML report:

# goaccess -f access.log -a > report.html

To generate a JSON report:

# goaccess -f access.log -a -d -o json > report.json

To generate a CSV file:

# goaccess -f access.log --no-csv-summary -o csv > report.csv

The -a flag indicates that we want to process an agent-list for every host parsed.

The -d flag indicates that we want to enable the IP resolver on the HTML | JSON output. (It will take longer to produce the output since it has to resolve all queries.)

The -c flag will prompt the date and log format configuration window, but only when curses is initialized.

Multiple Log Files

Filtering can be done through the use of pipes, for instance by using grep to filter specific data and then piping the output into GoAccess. This adds a great amount of flexibility to what GoAccess can display. For example:

If we would like to process all access.log.*.gz we can do one of the following:

# zcat -f access.log* | goaccess
# zcat access.log.*.gz | goaccess

Note: On Mac OS X, use gunzip -c instead of zcat.
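
To see what the combined stream looks like before it reaches GoAccess, here is a small self-contained sketch. The file names and log lines are made-up sample data, and it uses gzip -cdf, which with GNU gzip passes uncompressed files through unchanged in the same way zcat -f does.

```shell
#!/bin/sh
# Sketch: combine rotated logs (plain and gzip-compressed) into one stream.
# The file names and log lines below are made-up sample data.
work_dir="$(mktemp -d)"

printf '%s\n' '127.0.0.1 - - [01/Dec/2010:10:00:00 +0000] "GET /a HTTP/1.1" 200 10' \
  > "$work_dir/access.log"
printf '%s\n' '127.0.0.1 - - [30/Nov/2010:10:00:00 +0000] "GET /b HTTP/1.1" 200 10' \
  > "$work_dir/access.log.1"
gzip "$work_dir/access.log.1"   # becomes access.log.1.gz

# -d decompresses, -c writes to stdout, -f lets plain files pass through.
combined="$(gzip -cdf "$work_dir"/access.log*)"
printf '%s\n' "$combined"

rm -rf "$work_dir"
```

In real use the output would be piped into goaccess instead of being captured in a variable.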

Real Time HTML Output

GoAccess has the ability to output real-time data in the HTML report. You can even email the HTML file, since it is a single file with no external file dependencies. How neat is that!

To output an HTML report and set the WebSocket server to listen on port 7890 and localhost:

# goaccess -f access.log -o report.html --real-time-html

If GoAccess is running and parsing logs on a specific host, you can specify the URL to which the client's browser will connect:

# goaccess -f access.log -o report.html --real-time-html --ws-url=goaccess.io

To use a port other than 7890, you can specify it as:

# goaccess -f access.log -o report.html --real-time-html --ws-url=goaccess.io --port=9870

And to bind the WebSocket server to an address other than 0.0.0.0, you can specify it as:

# goaccess -f access.log -o report.html --real-time-html --ws-url=goaccess.io --addr=127.0.0.1

Working with Dates

Another useful pipe is one that filters the web log by date.

The following will get all HTTP requests starting on 05/Dec/2010 until the end of the file.

# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a

Or, using relative dates such as yesterday or one week ago:

# sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a

If we want to parse only a certain time-frame from DATE a to DATE b, we can do:

# sed -n '/05\/Nov\/2010/,/05\/Dec\/2010/ p' access.log | goaccess -a
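
To check what a date range selects before piping it into GoAccess, a sketch along these lines can help. The log lines are made-up sample data in common log format, and the result is captured in a variable only so it can be inspected.

```shell
#!/bin/sh
# Sketch: keep only the lines from a start date to the end of the log.
# The log lines are made-up sample data in common log format.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
10.0.0.1 - - [04/Dec/2010:23:59:59 +0000] "GET /old HTTP/1.1" 200 100
10.0.0.2 - - [05/Dec/2010:00:00:01 +0000] "GET /first HTTP/1.1" 200 100
10.0.0.3 - - [06/Dec/2010:12:00:00 +0000] "GET /later HTTP/1.1" 200 100
EOF

# -n suppresses default output; the range starts at the first line that
# matches the date and runs to the last line ($); p prints lines in range.
from_date="$(sed -n '/05\/Dec\/2010/,$ p' "$log_file")"
printf '%s\n' "$from_date"

rm -f "$log_file"
```

Note that the range only opens once a line matching the start date is found, so a start date with no entries in the log selects nothing.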

Virtual Hosts

Assume your log contains the virtual host field. For instance:

vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"

Suppose you would like to prepend the virtual host to the request in order to see which virtual host the top URLs belong to:

# awk '$8=$1$8' access.log | goaccess -a
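
The effect of the awk expression can be checked on a single line. The sketch below applies it to the sample log line shown earlier in this section and then prints the rewritten request field; the variable names are only for illustration.

```shell
#!/bin/sh
# Sketch: prefix the request path ($8) with the virtual host ($1).
line='vhost.io:80 8.8.4.4 - - [02/Mar/2016:08:14:04 -0600] "GET /shop HTTP/1.1" 200 615 "-" "Googlebot-Image/1.0"'

# The assignment $8=$1$8 is used as the awk pattern; it evaluates to a
# non-empty string, so awk prints the modified record.
rewritten="$(printf '%s\n' "$line" | awk '$8=$1$8')"
printf '%s\n' "$rewritten"

# Field 8 of the rewritten line now carries both the host and the path.
request="$(printf '%s\n' "$rewritten" | awk '{print $8}')"
printf '%s\n' "$request"
```

Because awk rebuilds the record with single spaces between fields, this works best on logs that are already single-space separated.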

To exclude a list of virtual hosts you can do the following:

# grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess
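
A similar exclusion can be sketched with grep's -f option, which reads the patterns from the file directly instead of going through command substitution. This is an alternative to the command above, not a description of it; the host names and log lines are made-up sample data.

```shell
#!/bin/sh
# Sketch: drop log lines that belong to any virtual host in an exclusion list.
# Host names and log lines are made-up sample data.
work_dir="$(mktemp -d)"
cat > "$work_dir/vhost_access.log" <<'EOF'
shop.example:80 10.0.0.1 - - [02/Mar/2016:08:14:04 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
blog.example:80 10.0.0.2 - - [02/Mar/2016:08:14:05 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
dev.example:80 10.0.0.3 - - [02/Mar/2016:08:14:06 -0600] "GET / HTTP/1.1" 200 615 "-" "curl"
EOF
printf '%s\n' 'blog.example' 'dev.example' > "$work_dir/exclude_vhost_list_file"

# -f reads one pattern per line from the file, -F treats each pattern as a
# fixed string, and -v inverts the match so only non-listed hosts remain.
kept="$(grep -v -F -f "$work_dir/exclude_vhost_list_file" "$work_dir/vhost_access.log")"
printf '%s\n' "$kept"

rm -rf "$work_dir"
```

Using -F avoids surprises from the dots in host names, which would otherwise be treated as regular-expression wildcards.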

Files & Status Codes

To parse specific pages, e.g., page views, html, htm, php, etc. within a request:

# awk '$7~/\.html|\.htm|\.php/' access.log | goaccess

Note that $7 is the request field for the common and combined log formats (without a virtual host). If your log includes the virtual host, you probably want to use $8 instead. It's best to check which field you are after, e.g.:

# tail -10 access.log | awk '{print $8}'

Or to parse a specific status code, e.g., 500 (Internal Server Error):

# awk '$9~/500/' access.log | goaccess
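
Both filters can be tried on a few lines first. The sketch below uses made-up sample data in combined log format without a virtual host field, so the request path is $7 and the status code is $9; for the status filter it uses a numeric comparison, which is a stricter variant of the regex match above.

```shell
#!/bin/sh
# Sketch: filter a combined-format log by request path and by status code.
# The log lines are made-up sample data (no virtual host field).
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
10.0.0.1 - - [05/Dec/2010:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl"
10.0.0.2 - - [05/Dec/2010:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 2048 "-" "curl"
10.0.0.3 - - [05/Dec/2010:10:00:02 +0000] "GET /cart.php HTTP/1.1" 500 0 "-" "curl"
EOF

# Keep only page requests (.html, .htm, .php).
pages="$(awk '$7~/\.html|\.htm|\.php/' "$log_file")"

# Keep only requests whose status code is exactly 500.
errors="$(awk '$9 == 500' "$log_file")"

printf 'pages:\n%s\n' "$pages"
printf 'errors:\n%s\n' "$errors"

rm -f "$log_file"
```

Either awk command can be placed in front of a pipe into goaccess once the field numbers are confirmed for your log format.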

Server

It is also worth pointing out that if we want to run GoAccess at a lower priority, we can run it as:

# nice -n 19 goaccess -f access.log -a

And if you don't want to install it on your server, you can still run it from your local machine:

# ssh root@server 'cat /var/log/apache2/access.log' | goaccess -a

Incremental Log Processing

GoAccess has the ability to process logs incrementally through the on-disk B+Tree database. It works in the following way:

  1. A data set must be persisted first with --keep-db-files; the same data set can then be loaded with --load-from-disk.
  2. If new data is passed (piped or through a log file), it will be appended to the original data set.
  3. To preserve the data at all times, --keep-db-files must be used.
  4. If --load-from-disk is used without --keep-db-files, database files will be deleted upon closing the program.

Examples

// last month's access log
# goaccess -f access.log.1 --keep-db-files

Then, load it with:

// append this month's access log, and preserve new data
# goaccess -f access.log --load-from-disk --keep-db-files

To read persisted data only (without parsing new data):

# goaccess --load-from-disk --keep-db-files

Contributing

Any help on GoAccess is welcome. The most helpful way is to try it out and give feedback. Feel free to use the GitHub issue tracker and pull requests to discuss and submit code changes.

Enjoy!

