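Choosing an archiver for backing up logs

Hello everyone!

In this article I want to describe how I chose an archiver for compressing the logs of our front-office system.

The department I work in develops and maintains the bank's unified front-office system. I am responsible for its maintenance, monitoring, and DevOps.

Our system is a high-load application that serves more than 5,000 unique users every day. Today it is a "monolith", with all the advantages and drawbacks that brings, but we are actively moving functionality into microservices.

Every day the system generates more than 130 GB of raw logs. Although we use the ENG stack (Elasticsearch, Nxlog, Graylog), the log files contain more information (for example, error stack traces), so they have to be archived and stored.

Because storage space is limited, a question arose: which archiver handles this task best?

To answer it, I wrote a PowerShell script that does the analysis for me.

The script calls the rar, 7z, and zip archivers with different compression settings and measures how long each archive takes to create and how much disk space it occupies.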


ArchSearch.ps1
    #Requires -Version 4.0

    # Clear the console
    Clear-Host

    # Change to the folder the script is run from, so that relative paths (.\) work
    Set-Location $PSScriptRoot

    # Output folder, base name of the archives, and the folder(s) that hold the logs
    $Archive = "Archive"
    $ArchFileName = "ArchFileName"
    [array]$path = (Get-ChildItem '.\logs').DirectoryName | Select-Object -Unique

    # Create the output folder if it does not exist yet
    if ((Test-Path -Path ".\$Archive") -ne $true) {
        New-Item -Path .\ -Name $Archive -ItemType Directory -Force
    } else {
        # Otherwise remove whatever a previous run left in it
        Get-ChildItem .\$Archive | Remove-Item -Recurse -Force
    }

    # Table that collects one row per archive
    [array]$table = @()

    # rar
    # m<0..5> Set compression level (0-store...3-default...5-maximal)
    1..5 | foreach {
        $CompressionLevel = $("-m" + $_)
        $mc = Measure-Command { cmd /c .\rar.exe a -ep1 -ed $CompressionLevel -o+ -tsc .\$Archive\$($ArchFileName + $_) "$path" }
        [math]::Round(($mc.TotalMilliseconds), 0)
        $ArchFileNamePath = ".\$Archive\$($ArchFileName + $_ + ".rar")"
        $table += "" | Select-Object -Property @{name="ArchFileName"; expression={$ArchFileNamePath -split "\\" | Select-Object -Last 1}},`
            @{name="CompressionLevel"; expression={$CompressionLevel}}, @{name="Extension"; expression={"rar"}}, @{name="Size"; expression={(Get-ChildItem $ArchFileNamePath).Length}},`
            @{name="Time"; expression={[math]::Round(($mc.TotalMilliseconds), 0)}},`
            @{name="Size %"; expression={0}}, @{name="Time %"; expression={0}}, @{name="Result %"; expression={0}}
    }

    # 7z
    # -mx[N] : set compression level: -mx1 (fastest) ... -mx9 (ultra)
    #cmd /c "$env:ProgramFiles\7-Zip\7z.exe" a -mx="$MX" -mmt="$MMT" -t7z -ssw -spf $($ArchFileName + "Fastest") "$path"
    1..9 | foreach {
        $CompressionLevel = $("-mx=" + $_)
        $mc = Measure-Command { cmd /c "$env:ProgramFiles\7-Zip\7z.exe" a $CompressionLevel -t7z -ssw -spf .\$Archive\$($ArchFileName + $_) $path } #-mmt="$MMT"
        [math]::Round(($mc.TotalMilliseconds), 0)
        $ArchFileNamePath = ".\$Archive\$($ArchFileName + $_ + ".7z")"
        $table += "" | Select-Object -Property @{name="ArchFileName"; expression={$ArchFileNamePath -split "\\" | Select-Object -Last 1}},`
            @{name="CompressionLevel"; expression={$CompressionLevel}}, @{name="Extension"; expression={"7z"}}, @{name="Size"; expression={(Get-ChildItem $ArchFileNamePath).Length}},`
            @{name="Time"; expression={[math]::Round(($mc.TotalMilliseconds), 0)}},`
            @{name="Size %"; expression={0}}, @{name="Time %"; expression={0}}, @{name="Result %"; expression={0}}
    }

    # zip (via the PowerShell cmdlet Compress-Archive, which ships with PowerShell 5.0 and later)
    1..2 | foreach {
        Switch ($_) {
            1 { $CompressionLevel = "Fastest" }
            2 { $CompressionLevel = "Optimal" }
        }
        $mc = Measure-Command { Compress-Archive -Path $path -DestinationPath .\$Archive\$($ArchFileName + $_) -CompressionLevel $CompressionLevel -Force }
        [math]::Round(($mc.TotalMilliseconds), 0)
        $ArchFileNamePath = ".\$Archive\$($ArchFileName + $_ + ".zip")"
        $table += "" | Select-Object -Property @{name="ArchFileName"; expression={$ArchFileNamePath -split "\\" | Select-Object -Last 1}},`
            @{name="CompressionLevel"; expression={$CompressionLevel}}, @{name="Extension"; expression={"zip"}}, @{name="Size"; expression={(Get-ChildItem $ArchFileNamePath).Length}},`
            @{name="Time"; expression={[math]::Round(($mc.TotalMilliseconds), 0)}},`
            @{name="Size %"; expression={0}}, @{name="Time %"; expression={0}}, @{name="Result %"; expression={0}}
    }

    # Smallest Size (element [0] after sorting) divided by 100: one percent of the best size
    $Size = ($table | Sort-Object -Property Size)[0].Size / 100
    # Shortest Time (element [0] after sorting) divided by 100: one percent of the best time
    $Time = ($table | Sort-Object -Property Time)[0].Time / 100

    # Fill in the percentage columns
    $table | foreach {
        $_.time
        $_."Size %" = [math]::Round(($_.Size / $Size), 0)
        $_."Time %" = [math]::Round(($_.Time / $Time), 0)
        if ($_."Size %" -ge $_."Time %") {
            $_."Result %" = $_."Size %" - $_."Time %"
        } else {
            $_."Result %" = $_."Time %" - $_."Size %"
        }
    }

    # First row when sorted by "Size %": the smallest archive
    $table | Sort-Object -Property "Size %","Result %" | Select-Object -First 1 | Format-Table -AutoSize
    # First row when sorted by "Result %": the best balance of size and speed
    $table | Sort-Object -Property "Result %","Size %","Time %" | Select-Object -First 1 | Format-Table -AutoSize

    # Clean up: delete the archives created by the test
    Get-ChildItem .\$Archive | Remove-Item -Force

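Preparation:

    # The script needs PowerShell 4.0 or later (check $PSVersionTable):
    #Requires -Version 4.0

    # Clear the console
    Clear-Host

    # Change to the folder the script is run from (so that relative paths .\ work)
    Set-Location $PSScriptRoot

    # Output folder, base name of the archives, and the folder(s) that hold the logs
    $Archive = "Archive"
    $ArchFileName = "ArchFileName"
    [array]$path = (Get-ChildItem '.\logs').DirectoryName | Select-Object -Unique

    # Create the output folder if it does not exist yet
    if ((Test-Path -Path ".\$Archive") -ne $true) {
        New-Item -Path .\ -Name $Archive -ItemType Directory -Force
    } else {
        # Otherwise remove whatever a previous run left in it
        Get-ChildItem .\$Archive | Remove-Item -Recurse -Force
    }

    # Table that collects one row per archive
    [array]$table = @()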

Let's go through the main loop in more detail, using rar as the example:


    # Run the loop body once for each value from 1 to 5
    1..5 | foreach {

        # Build the compression switch from the current value ($_): -m1 ... -m5
        $CompressionLevel = $("-m" + $_)

        # Measure-Command returns the time taken by the command inside {}
        $mc = Measure-Command { cmd /c .\rar.exe a -ep1 -ed $CompressionLevel -o+ -tsc .\$Archive\$($ArchFileName + $_) "$path" }

        # Print the elapsed time in milliseconds
        [math]::Round(($mc.TotalMilliseconds), 0)
        # Path of the archive just created, $ArchFileName + $_ + ".rar": ArchFileName1.rar
        $ArchFileNamePath = ".\$Archive\$($ArchFileName + $_ + ".rar")"
        # Append (+=) a new row to $table
        $table += "" | Select-Object -Property @{name="ArchFileName"; expression={$ArchFileNamePath -split "\\" | Select-Object -Last 1}},`
            @{name="CompressionLevel"; expression={$CompressionLevel}}, @{name="Extension"; expression={"rar"}}, @{name="Size"; expression={(Get-ChildItem $ArchFileNamePath).Length}},`
            @{name="Time"; expression={[math]::Round(($mc.TotalMilliseconds), 0)}},`
            @{name="Size %"; expression={0}}, @{name="Time %"; expression={0}}, @{name="Result %"; expression={0}}
    }
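The measurement done in this loop (elapsed time plus resulting file size) can be tried in isolation. The following is a minimal sketch, not part of ArchSearch.ps1. It assumes PowerShell 5.0 or later for Compress-Archive, works in a throwaway folder under the temp path, and compresses a synthetic app.log; the names $work, $log and $results are made up for the illustration.

```powershell
# Hypothetical scratch folder and synthetic log file, created only for this demo
$work = Join-Path ([System.IO.Path]::GetTempPath()) ("ArchDemo_" + [guid]::NewGuid().ToString('N'))
New-Item -Path $work -ItemType Directory -Force | Out-Null
$log = Join-Path $work 'app.log'
1..20000 | ForEach-Object { "2020-01-01 12:00:00 INFO request $_ handled" } | Set-Content -Path $log

$results = foreach ($level in 'Fastest', 'Optimal') {
    $zip = Join-Path $work "app_$level.zip"

    # Measure-Command returns a TimeSpan for the script block it runs
    $elapsed = Measure-Command {
        Compress-Archive -Path $log -DestinationPath $zip -CompressionLevel $level -Force
    }

    # One row per run: size in bytes, time in whole milliseconds
    [pscustomobject]@{
        CompressionLevel = $level
        Size             = (Get-Item $zip).Length
        Time             = [math]::Round($elapsed.TotalMilliseconds, 0)
    }
}

$results | Format-Table -AutoSize

# Remove the scratch folder again
Remove-Item -Path $work -Recurse -Force
```

The timings depend on the machine and on the data, so a single run on a small synthetic file says little; the article's script runs against real logs for that reason.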

    # Smallest Size (element [0] after sorting) divided by 100: one percent of the best size
    $Size = ($table | Sort-Object -Property Size)[0].Size / 100
    # Shortest Time (element [0] after sorting) divided by 100: one percent of the best time
    $Time = ($table | Sort-Object -Property Time)[0].Time / 100

    # Fill in the percentage columns; "Result %" is the absolute difference between the two
    $table | foreach {
        $_.time
        $_."Size %" = [math]::Round(($_.Size / $Size), 0)
        $_."Time %" = [math]::Round(($_.Time / $Time), 0)
        if ($_."Size %" -ge $_."Time %") {
            $_."Result %" = $_."Size %" - $_."Time %"
        } else {
            $_."Result %" = $_."Time %" - $_."Size %"
        }
    }

    # First row when sorted by "Size %": the smallest archive
    $table | Sort-Object -Property "Size %","Result %" | Select-Object -First 1 | Format-Table -AutoSize
    # First row when sorted by "Result %": the best balance of size and speed
    $table | Sort-Object -Property "Result %","Size %","Time %" | Select-Object -First 1 | Format-Table -AutoSize
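To see what the three percentage columns mean, the same normalization can be applied to a handful of rows by hand. The sketch below is an illustration only: Add-Score is a hypothetical helper that does not exist in ArchSearch.ps1, and the three input rows (size in bytes, time in milliseconds) are copied from the results published in this article.

```powershell
# Hypothetical helper (not part of ArchSearch.ps1): applies the same
# normalization to rows that already carry Size (bytes) and Time (ms).
function Add-Score {
    param([Parameter(Mandatory)][object[]]$Rows)

    # One percent of the smallest size and of the shortest time
    $sizeUnit = ($Rows | Measure-Object -Property Size -Minimum).Minimum / 100
    $timeUnit = ($Rows | Measure-Object -Property Time -Minimum).Minimum / 100

    foreach ($row in $Rows) {
        $sizePct = [math]::Round($row.Size / $sizeUnit, 0)
        $timePct = [math]::Round($row.Time / $timeUnit, 0)
        [pscustomobject]@{
            Name       = $row.Name
            'Size %'   = $sizePct
            'Time %'   = $timePct
            # Absolute difference, as in the if/else of the original script
            'Result %' = [math]::Abs($sizePct - $timePct)
        }
    }
}

# Three rows taken from the article's results
$rows = @(
    [pscustomobject]@{ Name = '7z -mx=1'; Size = 57472172; Time = 3728 }
    [pscustomobject]@{ Name = '7z -mx=3'; Size = 44545819; Time = 5889 }
    [pscustomobject]@{ Name = '7z -mx=8'; Size = 26511520; Time = 62404 }
)

$scored = Add-Score -Rows $rows
$scored | Sort-Object -Property 'Result %' | Format-Table -AutoSize
```

A row scores 100 in a column when it is the best in that column, so "Result %" is small when an archiver is about equally far from the best size and from the best time. Note that this favors balance rather than absolute quality: a setting that is poor on both axes by the same factor would also get a low "Result %".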

The full table sorted by size (Size is in bytes, Time in milliseconds):

    $table | Sort-Object -Property "Size %","Result %" | Format-Table -AutoSize

    ArchFileName      CompressionLevel Extension     Size  Time Size % Time % Result %
    ------------      ---------------- ---------     ----  ---- ------ ------ --------
    ArchFileName8.7z  -mx=8            7z        26511520 62404    100   1674     1574
    ArchFileName9.7z  -mx=9            7z        26511520 64614    100   1733     1633
    ArchFileName7.7z  -mx=7            7z        28948176 52832    109   1417     1308
    ArchFileName6.7z  -mx=6            7z        30051742 37339    113   1002      889
    ArchFileName5.7z  -mx=5            7z        31239355 35169    118    943      825
    ArchFileName4.rar -m4              rar       33514693 11426    126    306      180
    ArchFileName5.rar -m5              rar       33465152 12894    126    346      220
    ArchFileName3.rar -m3              rar       33698079  9835    127    264      137
    ArchFileName2.rar -m2              rar       34399885  8352    130    224       94
    ArchFileName4.7z  -mx=4            7z        38926348  6470    147    174       27
    ArchFileName3.7z  -mx=3            7z        44545819  5889    168    158       10
    ArchFileName2.7z  -mx=2            7z        51690114  4754    195    128       67
    ArchFileName1.rar -m1              rar       53605833  4600    202    123       79
    ArchFileName1.7z  -mx=1            7z        57472172  3728    217    100      117
    ArchFileName2.zip Optimal          zip       65733242 14025    248    376      128
    ArchFileName1.zip Fastest          zip       81556824  9031    308    242       66

The same table sorted by the balance between size and speed:

    $table | Sort-Object -Property "Result %","Size %","Time %" | Format-Table -AutoSize

    ArchFileName      CompressionLevel Extension     Size  Time Size % Time % Result %
    ------------      ---------------- ---------     ----  ---- ------ ------ --------
    ArchFileName3.7z  -mx=3            7z        44545819  5889    168    158       10
    ArchFileName4.7z  -mx=4            7z        38926348  6470    147    174       27
    ArchFileName1.zip Fastest          zip       81556824  9031    308    242       66
    ArchFileName2.7z  -mx=2            7z        51690114  4754    195    128       67
    ArchFileName1.rar -m1              rar       53605833  4600    202    123       79
    ArchFileName2.rar -m2              rar       34399885  8352    130    224       94
    ArchFileName1.7z  -mx=1            7z        57472172  3728    217    100      117
    ArchFileName2.zip Optimal          zip       65733242 14025    248    376      128
    ArchFileName3.rar -m3              rar       33698079  9835    127    264      137
    ArchFileName4.rar -m4              rar       33514693 11426    126    306      180
    ArchFileName5.rar -m5              rar       33465152 12894    126    346      220
    ArchFileName5.7z  -mx=5            7z        31239355 35169    118    943      825
    ArchFileName6.7z  -mx=6            7z        30051742 37339    113   1002      889
    ArchFileName7.7z  -mx=7            7z        28948176 52832    109   1417     1308
    ArchFileName8.7z  -mx=8            7z        26511520 62404    100   1674     1574
    ArchFileName9.7z  -mx=9            7z        26511520 64614    100   1733     1633

    # Clean up: delete the archives created by the test
    Get-ChildItem .\$Archive | Remove-Item -Force

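Results:

In terms of disk space, the most economical option is 7z at compression levels -mx=8 and -mx=9.

The fastest archive creation is 7z at compression level -mx=1.

The best balance of speed and size is 7z at compression level -mx=3.

We chose 7z with -mx=8: the archive is the same size as with -mx=9 (26,511,520 bytes), but it is created faster (62,404 ms versus 64,614 ms).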


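Great, the archiver and the compression level are chosen. Now let's archive the logs!

We need to solve the following tasks:

  1. Avoid raising the load on the server while the script is running.
  2. Process every folder that has a logs subfolder.
  3. Delete archives older than 30 days (so that the disk does not fill up).
  4. Create one archive per day, based on the time each file was last modified.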
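Tasks 3 and 4 come down to date arithmetic on LastWriteTime. The sketch below illustrates that logic on in-memory objects instead of real files, so it can be run anywhere. It is an assumption-laden illustration, not the production script: the paths, the site1 prefix, and the variables $files, $batches, $archives and $stale are all hypothetical.

```powershell
# Hypothetical in-memory stand-ins for log files; a real run would use Get-ChildItem
$today = (Get-Date).Date
$files = @(
    [pscustomobject]@{ FullName = 'C:\inetpub\site1\logs\a.log'; LastWriteTime = $today.AddDays(-2).AddHours(9) }
    [pscustomobject]@{ FullName = 'C:\inetpub\site1\logs\b.log'; LastWriteTime = $today.AddDays(-2).AddHours(17) }
    [pscustomobject]@{ FullName = 'C:\inetpub\site1\logs\c.log'; LastWriteTime = $today.AddDays(-1).AddHours(3) }
    [pscustomobject]@{ FullName = 'C:\inetpub\site1\logs\d.log'; LastWriteTime = $today.AddHours(1) }
)

# Task 4: skip files written today, then build one batch per calendar day
$batches = $files |
    Where-Object { $_.LastWriteTime -lt $today } |
    Group-Object -Property { $_.LastWriteTime.Date } |
    ForEach-Object {
        $day = $_.Group[0].LastWriteTime.Date
        [pscustomobject]@{
            # Archive name pattern used in the article: <folder name>_dd-MM-yyyy
            ArchiveName = 'site1_' + $day.ToString('dd-MM-yyyy')
            Files       = @($_.Group.FullName)
        }
    }

# Task 3: hypothetical existing archives; anything 30 days old or older is stale
$archives = @(
    [pscustomobject]@{ Name = 'site1_old.7z'; LastWriteTime = (Get-Date).AddDays(-45) }
    [pscustomobject]@{ Name = 'site1_new.7z'; LastWriteTime = (Get-Date).AddDays(-3) }
)
$stale = @($archives | Where-Object { $_.LastWriteTime -le (Get-Date).AddDays(-30) })

$batches | Format-Table -AutoSize
$stale | Format-Table -AutoSize
```

Files modified today are left alone because the application may still be writing to them; they are picked up by the next day's run.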

ArchLogs.ps1
    #Requires -Version 4.0

    # Scheduled task that runs the script every day at 22:00:
    #SCHTASKS /Create /SC DAILY /ST 22:00 /TN ArchLogs /TR "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe C:\Scripts\ArchLogs.ps1" /RU "NT AUTHORITY\NETWORKSERVICE" /F /RL HIGHEST

    # Clear the console
    Clear-Host

    # Change to the folder the script is run from
    Set-Location $PSScriptRoot

    # Work out how many threads 7z may use so that the server is not overloaded
    # Task 1
    Function Set-MMTCore {
        $Core = (Get-WmiObject -Class Win32_Processor | Measure-Object NumberOfLogicalProcessors -Sum).Sum / 2
        $CoreTMP = $Core / 4
        IF ($Core -lt 4) {
            $Core = $Core - 1
        } Else {
            $Core = $Core - $CoreTMP
        }
        [math]::Round($Core)
    }

    # Compression level
    $MX = 8
    # Number of threads for 7z
    $MMT = Set-MMTCore

    # Folder in which to look for logs
    $SearchFolder = "C:\inetpub"
    # Everything directly under $SearchFolder
    $InetpubFolder = Get-ChildItem $SearchFolder

    # Find the Logs folders
    # Task 2
    $LogsFolder = $InetpubFolder | foreach { Get-ChildItem $_.FullName | Where-Object { $_.PSIsContainer -eq $true } | Where-Object { $_.Name -eq "logs" } }

    # Name of the archive folder and of the temporary list file
    $Archive = "Archive"
    $ArchiveFile = "ArchFiles.txt"

    # Process each folder from $LogsFolder in turn as $LogsFolderName
    foreach ($LogsFolderName in $LogsFolder.FullName) {
        $LogsFolderName

        # Create the $Archive folder if it does not exist yet
        IF ((Test-Path -Path "$LogsFolderName\$Archive") -ne $true) { New-Item -Path $LogsFolderName -Name $Archive -ItemType Directory -Force }

        # Delete archives older than 30 days
        # Task 3
        Get-ChildItem "$LogsFolderName\$Archive" | Where-Object { $_.LastWriteTime -le (Get-Date).AddDays(-30) } | Remove-Item -Force

        # Everything in the Logs folder except the archive folder
        [Array]$ArchiveItems = Get-ChildItem -Path $LogsFolderName -Exclude $Archive

        # Continue only if the folder is not empty
        IF ($ArchiveItems.Count -ne 0) {
            # All files last modified before today
            $AllLogsFiles = Get-ChildItem $ArchiveItems -Recurse <#-Filter *.log#> | Where-Object { $_.LastWriteTime -lt (Get-Date).Date }

            # The distinct dates on which those files were modified
            # Task 4
            $AllData = ($AllLogsFiles | Sort-Object -Property LastWriteTime).LastWriteTime.Date | Select-Object -Unique
            $AllData

            foreach ($Data in $AllData) {
                IF ($ArchiveItems.Count -ne 0) {
                    # Write the list of files modified on this date to the list file
                    ($AllLogsFiles | Where-Object { $_.LastWriteTime -lt (Get-Date).Date } | Where-Object { $_.LastWriteTime -ge (Get-Date $Data) -and $_.LastWriteTime -lt (Get-Date $Data).AddDays(1) }).FullName | Out-File .\$ArchiveFile -Force -Encoding default
                    Write-Host "===" $Data
                    Write-Host "==="

                    # Archive only if the list file is not empty
                    IF ($(Get-Content .\ArchFiles.txt -Encoding default) -ne $null) {
                        # Archive name: <name of the folder that contains logs>_dd-MM-yyyy
                        $ArchiveFileName = $(($LogsFolderName.Remove($LogsFolderName.LastIndexOf("\"))) -split "\\" | Select-Object -Last 1) + "_" + $(Get-Date $Data -Format dd-MM-yyyy)
                        cmd /c "$env:ProgramFiles\7-Zip\7z.exe" a -mx="$MX" -mmt="$MMT" -t7z -ssw -sdel "$LogsFolderName\$Archive\$ArchiveFileName" "@$ArchiveFile"

                        # Remove the temporary list file
                        IF (Test-Path ".\$ArchiveFile") { Remove-Item ".\$ArchiveFile" -Force }

                        # Remove subfolders of Logs that are now empty
                        $LogsFolderName | foreach { Get-ChildItem $_ | Where-Object { (Get-ChildItem -Path $_.FullName) -eq $null } | Remove-Item -Force }
                    }
                }
            }
        }
    }

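I will not walk through every line of the script here (the comments describe them). I will only dwell on the Set-MMTCore function, which calculates the number of threads for 7z so that archiving does not max out the server's processor: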


    Function Set-MMTCore {
        # Number of logical processors divided by 2
        $Core = (Get-WmiObject -Class Win32_Processor | Measure-Object NumberOfLogicalProcessors -Sum).Sum / 2
        # A quarter of that value
        $CoreTMP = $Core / 4
        IF ($Core -lt 4) {
            $Core = $Core - 1
        } Else {
            $Core = $Core - $CoreTMP
        }
        # Round the result
        [math]::Round($Core)
    }
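To get a feel for the numbers this formula produces, the arithmetic can be separated from the WMI query. The sketch below uses a hypothetical helper, Get-MmtThreadCount, which is not in ArchLogs.ps1; it repeats the same calculation but takes the number of logical processors as a parameter, so it can be evaluated for machines of different sizes.

```powershell
# Hypothetical helper: same arithmetic as Set-MMTCore, but the number of
# logical processors is passed in instead of being read from WMI.
function Get-MmtThreadCount {
    param([Parameter(Mandatory)][int]$LogicalProcessors)

    $core = $LogicalProcessors / 2       # half of the logical processors
    if ($core -lt 4) {
        $core = $core - 1                # small machines: keep one more core free
    } else {
        $core = $core - ($core / 4)      # larger machines: keep a quarter free
    }
    [math]::Round($core)
}

# Thread counts for a few example machine sizes
$threadTable = 4, 8, 16, 32 | ForEach-Object {
    [pscustomobject]@{
        LogicalProcessors = $_
        Threads           = Get-MmtThreadCount -LogicalProcessors $_
    }
}
$threadTable | Format-Table -AutoSize
```

For 4, 8, 16 and 32 logical processors this gives 1, 3, 6 and 12 threads. For a machine with only 2 logical processors the formula gives 0, so on very small servers a lower bound such as [math]::Max(1, ...) may be worth adding; that guard is a suggestion, not something the original script does.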

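Without the Set-MMTCore function:

As you can see, the CPU is pinned at 100%. This means we would inevitably cause problems on the server and receive alerts from the monitoring system.

With the Set-MMTCore function:

As you can see, CPU usage stays at 30-35%. This means that with Set-MMTCore the files can be archived without affecting the server's operation.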



Results of the backup script:

Folder before archiving:

Folder after archiving:

Files created during archiving:

Files inside the archive:


Source: https://habr.com/ru/post/zh-CN484414/

