✋ 🍁 🎒 zip归档文件是什么样子，我们可以做什么。第三部分-实际应用 🛴 ⛳️ ✊🏿

文章的继续zip存档的外观以及我们可以使用它做些什么。 第2部分-数据描述符和压缩。

亲爱的读者，我再次欢迎您使用PHP进行非常规编程的转移。为了了解正在发生的事情，我建议您阅读前两篇有关zip归档文件的文章： zip归档文件是什么样子，我们可以做什么？ zip归档文件是什么样子，我们可以使用它做什么。第2部分-数据描述符和压缩

之前，我谈到了如何仅使用PHP代码而不使用任何库和扩展名（包括标准zip ）来创建档案，还提到了一些使用场景。今天，我将尝试给出其中一种情况的示例。

我们将图片存储在远程服务器上的存档中，并且如有必要，向用户显示特定的图片，而无需下载或解压缩存档，而仅从服务器本身接收数据，即专门拍摄的图片，仅此而已（嗯，没有人取消了标头的开销但仍然）。

做饭

我的一个宠物项目有一个简单的脚本，我可以用它下载一堆图片并将其保存到档案中。该脚本接受json中的链接列表以输入STDIN，将存档本身提供给STDOUT，并在STDERR中将其具有数组结构的json（当然，这有点过头了，不必使用本机PHP创建存档，您可以为此使用更合适的工具，然后阅读结构-将来我会尝试强调这一点，但在现阶段，在我看来，这将更加明显）。

该脚本经过一些修改，下面引用：

zip.php

<?php $buffer = ''; while (!feof(STDIN)) { $buffer .= fread(STDIN, 4096); } $photos = json_decode($buffer, true, 512, JSON_THROW_ON_ERROR); $skeleton = ['files' => []]; $written = 0; [$written, $skeleton] = writeZip($written, $skeleton, $photos); $CDFH_Offset = $written; foreach ($skeleton['files'] as $index => $info) { $written += fwrite(STDOUT, $info['CDFH']); $skeleton['files'][$index]['CDFH'] = base64_encode($info['CDFH']); } $skeleton['EOCD'] = pack('LSSSSLLS', 0x06054b50, 0, 0, $records = count($skeleton['files']), $records, $written - $CDFH_Offset, $CDFH_Offset, 0); $written += fwrite(STDOUT, $skeleton['EOCD']); $skeleton['EOCD'] = base64_encode($skeleton['EOCD']); fwrite(STDERR, json_encode($skeleton)); function writeZip($written, $skeleton, $files) { $c = curl_init(); curl_setopt_array($c, [ CURLOPT_RETURNTRANSFER => 1, CURLOPT_TIMEOUT => 50, CURLOPT_FOLLOWLOCATION => true, ]); foreach ($files as $index => $url) { $fileName = $index . '.jpg'; for ($i = 0; $i < 1; $i++ ) { try { curl_setopt($c, CURLOPT_URL, $url); [$content, $code, $contentLength] = [ curl_exec($c), (int) curl_getinfo($c, CURLINFO_HTTP_CODE), (int) curl_getinfo($c, CURLINFO_CONTENT_LENGTH_DOWNLOAD) ]; if ($code !== 200) { throw new \Exception("[{$index}] " . 'Photo download error (' . $code . '): ' . curl_error($c)); } if (strlen($content) !== $contentLength) { var_dump(strlen($content), $contentLength); throw new \Exception("[{$index}] " . 'Different content-length'); } if ((false === $imageSize = @getimagesizefromstring($content)) || $imageSize[0] < 1 || $imageSize[1] < 1) { throw new \Exception("[{$index}] " . 'Broken image'); } [$width, $height] = $imageSize; $t = null; break; } catch (\Throwable $t) {} } if ($t !== null) { throw new \Exception('Error: ' . $index . ' > ' . $url, 0, $t); } $fileInfo = [ 'versionToExtract' => 10, 'generalPurposeBitFlag' => 0, 'compressionMethod' => 0, 'modificationTime' => 28021, 'modificationDate' => 20072, 'crc32' => hexdec(hash('crc32b', $content)), 'compressedSize' => $size = strlen($content), 'uncompressedSize' => $size, 'filenameLength' => strlen($fileName), 'extraFieldLength' => 0, ]; $LFH_Offset = $written; $skeleton['files'][$index] = [ 'LFH' => pack('LSSSSSLLLSSa*', 0x04034b50, ...array_values($fileInfo + ['filename' => $fileName])), 'CDFH' => pack('LSSSSSSLLLSSSSSLLa*', 0x02014b50, 798, ...array_values($fileInfo + [ 'fileCommentLength' => 0, 'diskNumber' => 0, 'internalFileAttributes' => 0, 'externalFileAttributes' => 2176057344, 'localFileHeaderOffset' => $LFH_Offset, 'filename' => $fileName, ])), 'width' => $width, 'height' => $height, ]; $written += fwrite(STDOUT, $skeleton['files'][$index]['LFH']); $written += fwrite(STDOUT, $content); $skeleton['files'][$index]['LFH'] = base64_encode($skeleton['files'][$index]['LFH']); } curl_close($c); return [$written, $skeleton]; }

让我们将其保存在某个地方，调用zip.php并尝试运行它。

我已经有一个现成的链接列表，我建议使用它，因为有大约18mb的图片，并且我们需要的总档案大小不超过20mb（稍后我会告诉原因）-https://gist.githubusercontent.com /userqq/d0ca3aba6b6762c9ce5bc3ace92a9f9e/raw/70f446eb98f1ba6838ad3c19c3346cba371fd263/photos.json

我们将像这样运行：

 $ curl -s \ https://gist.githubusercontent.com/userqq/d0ca3aba6b6762c9ce5bc3ace92a9f9e/raw/70f446eb98f1ba6838ad3c19c3346cba371fd263/photos.json \ | php zip.php \ 2> structure.json \ 1> photos.zip

脚本完成工作后，在输出中我们应该获得两个文件-photos.zip ，我建议使用以下命令进行检查：

 $ unzip -tq photos.zip

和structure.json ，其中我们将json和我们存档中所有内容的base64存储在一起，除了数据本身-所有LFH和CDFH结构以及EOCD。我还没有看到EOCD的任何实际应用与存档无关，但是LFH偏移相对于文件开头的位置以及数据长度在CDFH中显示。因此，知道LFH的长度后，我们当然可以知道数据将从何处开始以及在何处结束。

现在我们需要将文件上传到某个远程服务器。

例如，我将使用电报bot-这是最简单的，这就是为什么适应20mb限制如此重要的原因。

使用@BotFather注册该机器人，如果您还没有机器人，请向您的机器人写一些欢迎的信息，在https://api.telegram.org/bot{{TOKEN►►/getUpdates ，从中隔离message.from属性.id =这是您的聊天机器人的ID。

将您在上一步中获得的归档文件与您一起填写：

 $ curl -F document=@"photos.zip" "https://api.telegram.org/bot{{TOKEN}}/sendDocument?chat_id={{CHAT ID}}" > stored.json

现在我们已经有了两个json文件-structure.json和stored.json 。

如果一切顺利，则stored.json文件将包含具有确定字段等于true的json，以及我们需要的result.document.file_id。

样机

现在一切都准备就绪，让我们开始做生意：

 <?php define('TOKEN', ''); //    $structure = json_decode(file_get_contents('structure.json'), true, 512, JSON_THROW_ON_ERROR); $stored = json_decode(file_get_contents('stored.json'), true, 512, JSON_THROW_ON_ERROR); $selectedFile = $structure['files'][array_rand($structure['files'])]; $LFH = base64_decode($selectedFile['LFH']); $CDFH = base64_decode($selectedFile['CDFH']); $fileLength = unpack('Llen', substr($CDFH, 24, 4))['len']; $fileStart = unpack('Lpos', substr($CDFH, 42, 4))['pos'] + strlen($LFH); $fileEnd = $fileStart + $fileLength; $response = json_decode(file_get_contents('https://api.telegram.org/bot' . TOKEN . '/getFile?' . http_build_query([ 'file_id' => $stored['result']['document']['file_id'] ])), true, 512, JSON_THROW_ON_ERROR); header('Content-Type: image/jpeg'); readfile('https://api.telegram.org/file/bot' . TOKEN . '/' . $response['result']['file_path'], false, stream_context_create([ 'http' => ['header' => "Range: bytes={$fileStart}-{$fileEnd}"] ]));

在此脚本中，我们从存档中的列表中选择一个随机文件，找到LFH起始位置，向其中添加LFH长度，从而获得存档中特定文件的数据的起始位置。

加上文件长度，我们得到数据结尾的位置。

剩下的只是获取文件本身的地址（在特殊情况下是用电报），然后足以简单地以“ Range: bytes=<->-<->为标题发出请求Range: bytes=<->-<->

（假设服务器支持范围请求）

结果，当访问指定的脚本时，我们将从存档中看到随机图像的内容。

不，这不严重...

毋庸置疑 ，对getFile的请求应该被缓存，因为电报保证该链接至少存在一个小时。总的来说，这个例子除了证明它是有效的之外，无济于事。

让我们再努力一点-在amphp上全部运行。毕竟，人们在node.js上的生产中分发静态变量，但是是什么阻止了我们（当然，除了常识之外）？您也知道，我们在这里也有异步性。

我们将需要3个软件包：

amphp / http-server-显然，我们将文件分发到的服务器
amphp / artax-我们将通过其获取数据的http客户端，以及更新指向缓存中文件的直接链接。
amphp / parallel是一个用于多线程的库。但是请不要害怕，我们只需要其中的SharedMemoryParcel，它将用作内存缓存。

我们放入必要的依赖项：

 $ composer require amphp/http-server amphp/artax amphp/parallel

并写这个

剧本

 <?php define('TOKEN', ''); //    use Amp\Artax\DefaultClient; use Amp\Artax\Request as ArtaxRequest; use Amp\Http\Server\RequestHandler\CallableRequestHandler; use Amp\Http\Server\Server as HttpServer; use Amp\Http\Server\Request; use Amp\Http\Server\Response; use Amp\Http\Status; use Amp\Parallel\Sync\SharedMemoryParcel; use Amp\Socket; use Psr\Log\NullLogger; require 'vendor/autoload.php'; Amp\Loop::run(function () { $structure = json_decode(file_get_contents('structure.json'), true, 512, JSON_THROW_ON_ERROR); $stored = json_decode(file_get_contents('stored.json'), true, 512, JSON_THROW_ON_ERROR); $parcel = SharedMemoryParcel::create($id = bin2hex(random_bytes(10)), []); $client = new DefaultClient(); $handler = function (Request $request) use ($parcel, $client, $structure, $stored) { $cached = yield $parcel->synchronized(function (array $value) use ($client, $stored) { if (!isset($value['file_path']) || $value['expires'] <= time()) { $response = yield $client->request('https://api.telegram.org/bot' . TOKEN . '/getFile?' . http_build_query([ 'file_id' => $stored['result']['document']['file_id'] ])); $json = json_decode(yield $response->getBody(), true, 512, JSON_THROW_ON_ERROR); $value = ['file_path' => $json['result']['file_path'], 'expires' => time() + 55 * 60]; } return $value; }); $selectedFile = $structure['files'][array_rand($structure['files'])]; $LFH = base64_decode($selectedFile['LFH']); $CDFH = base64_decode($selectedFile['CDFH']); $fileLength = unpack('Llen', substr($CDFH, 24, 4))['len']; $fileStart = unpack('Lpos', substr($CDFH, 42, 4))['pos'] + strlen($LFH); $fileEnd = $fileStart + $fileLength; $request = (new ArtaxRequest('https://api.telegram.org/file/bot' . TOKEN . '/' . $cached['file_path'])) ->withHeader('Range', "bytes={$fileStart}-{$fileEnd}"); $response = yield $client->request($request); if (!in_array($response->getStatus(), [200, 206])) { return new Response(Status::SERVICE_UNAVAILABLE, ['content-type' => 'text/plain'], "Service Unavailable."); } return new Response(Status::OK, ['content-type' => 'image/jpeg'], $response->getBody()); }; $server = new HttpServer([Socket\listen("0.0.0.0:10051")], new CallableRequestHandler($handler), new NullLogger); yield $server->start(); Amp\Loop::onSignal(SIGINT, function (string $watcherId) use ($server) { Amp\Loop::cancel($watcherId); yield $server->stop(); }); });

结果是：我们现在有了一个缓存，在其中将指向文档的链接保存了55分钟，这还确保了，如果在缓存过期时收到很多请求，我们将不会连续多次请求链接。好吧，所有这些显然比使用PHP-FPM（或者，天哪，PHP-CGI）的readfile容易。

此外，您可以提高amphp实例池-SharedMemoryParcel的名称，暗示我们的缓存将在进程/线程之间混乱。好吧，或者，如果您仍然对此设计的可靠性有健康的担忧，那么可以使用proxy_pass和nginx 。

总的来说，这个想法并不是什么新鲜事，甚至在大胡子的岁月里，DownloadMaster允许您从Zip存档中下载特定文件而无需下载整个存档，因此不必单独存储结构，但是可以节省一些请求。例如，如果我们要动态将文件代理给用户该怎么办，它将在很大程度上影响请求的速度。

为什么这会派上用场？例如，为了保存inode，当您有数以千万计的小文件并且不想直接将它们存储在数据库中时，可以将它们存储在归档中，知道偏移量以接收单个文件。

或者，如果您要远程存储文件。当然，您无法绕开20mb的电报，但是谁知道，还有其他选择。

zip归档文件是什么样子，我们可以做什么。 第三部分-实际应用

做饭

样机

不，这不严重...

More articles:

zip归档文件是什么样子，我们可以做什么。第三部分-实际应用