Exam Question Walkthrough | Streaming Ingestion of Massive Data with Amazon Kinesis Data Firehose


  Question

A company has thousands of edge devices that collectively generate 1 TB of status alerts each day. Each alert is approximately 2 KB in size. A solutions architect needs to implement a solution to ingest and store the alerts for future analysis. The company wants a highly available solution. However, the company needs to minimize costs and does not want to manage additional infrastructure. Additionally, the company wants to keep 14 days of data available for immediate analysis and archive any data older than 14 days.
What is the MOST operationally efficient solution that meets these requirements?
A. Create an Amazon Kinesis Data Firehose delivery stream to ingest the alerts. Configure the Kinesis Data Firehose stream to deliver the alerts to an Amazon S3 bucket. Set up an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days.
B. Launch Amazon EC2 instances across two Availability Zones and place them behind an Elastic Load Balancer to ingest the alerts. Create a script on the EC2 instances that will store the alerts in an Amazon S3 bucket. Set up an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days.
C. Create an Amazon Kinesis Data Firehose delivery stream to ingest the alerts. Configure the Kinesis Data Firehose stream to deliver the alerts to an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster. Set up the Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster to take manual snapshots every day and delete data from the cluster that is older than 14 days.
D. Create an Amazon Simple Queue Service (Amazon SQS) standard queue to ingest the alerts, and set the message retention period to 14 days. Configure consumers to poll the SQS queue, check the age of the message, and analyze the message data as needed. If the message is 14 days old, the consumer should copy the message to an Amazon S3 bucket and delete the message from the SQS queue.

  Reference Answer

A

  Explanation

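A. Correct. This option ingests the alerts with an Amazon Kinesis Data Firehose delivery stream, delivers them to an Amazon S3 bucket, and uses an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days. Kinesis Data Firehose is a fully managed service that ingests, transforms, and loads streaming data into destinations such as Amazon S3. It is highly available and leaves no infrastructure for the company to manage. Amazon S3 is a highly available object storage service that can hold large volumes of data, so the most recent 14 days of alerts remain available for immediate analysis. The S3 Lifecycle configuration then moves data older than 14 days to S3 Glacier, a lower-cost archival storage option, without any custom code. Because every component is a fully managed AWS service, this solution satisfies the high-availability, cost, and no-infrastructure requirements together.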
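As a rough illustration of option A, the sketch below builds the two payloads such a setup would need: batches of Firehose records and an S3 Lifecycle rule that archives after 14 days. It is a minimal sketch under stated assumptions, not a reference implementation. The helper names, the `alerts/` prefix, the rule ID, and the stream and bucket names in the comments are all hypothetical. The code only constructs plain dictionaries, so it runs without AWS credentials.

```python
import json

# Firehose PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_BATCH = 500


def to_firehose_batches(alerts, batch_size=MAX_RECORDS_PER_BATCH):
    """Group alert dicts into lists of Firehose-style records ({"Data": bytes})."""
    # A trailing newline keeps alerts separable once Firehose concatenates them in S3.
    records = [{"Data": (json.dumps(a) + "\n").encode("utf-8")} for a in alerts]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def build_lifecycle_configuration(prefix="alerts/", days=14):
    """Build a lifecycle rule that transitions objects under `prefix` to S3 Glacier."""
    return {
        "Rules": [
            {
                "ID": f"archive-alerts-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # hypothetical key prefix
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# With boto3 installed and credentials configured, these payloads could be sent as:
#   boto3.client("firehose").put_record_batch(
#       DeliveryStreamName="alerts-stream",      # hypothetical stream name
#       Records=batch)
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-alerts-bucket",          # hypothetical bucket name
#       LifecycleConfiguration=build_lifecycle_configuration())
```

At roughly 2 KB per alert, a full batch of 500 records is about 1 MB, which stays under the 4 MiB per-call limit of PutRecordBatch.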
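B. Incorrect. This option launches Amazon EC2 instances across two Availability Zones behind an Elastic Load Balancer, uses a script on the instances to store the alerts in an Amazon S3 bucket, and applies an S3 Lifecycle configuration to transition data to Amazon S3 Glacier after 14 days. The company would have to manage the EC2 instances itself, which adds infrastructure management overhead and conflicts with the requirement not to manage additional infrastructure. The Elastic Load Balancer does provide high availability, but it also adds complexity and cost.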
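C. Incorrect. This option ingests the alerts with a Kinesis Data Firehose delivery stream, delivers them to an Amazon OpenSearch Service (Amazon Elasticsearch Service) cluster, takes manual snapshots of the cluster every day, and deletes data older than 14 days from the cluster. Amazon OpenSearch Service can be used to analyze the data, but it is comparatively expensive and is not suited to long-term storage of a large volume of alerts. Taking manual snapshots and deleting old data every day is also tedious operational work. The option therefore fails both the cost-minimization requirement and the requirement not to manage additional infrastructure.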
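D. Incorrect. This option ingests the alerts with an Amazon Simple Queue Service (Amazon SQS) standard queue whose message retention period is set to 14 days. Consumers poll the queue, check each message's age, and analyze the data as needed. When a message is 14 days old, the consumer copies it to an Amazon S3 bucket and deletes it from the queue. An SQS standard queue is a message queue, not a store for large volumes of alert data. In addition, 14 days is the maximum retention period SQS allows, and messages that reach the retention limit are deleted automatically, so waiting until a message is 14 days old before copying it risks losing it. The consumers must also check message age and perform the copy and delete operations themselves, which makes the solution more complex to manage and more costly.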
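The scale behind these judgments can be checked with quick arithmetic. The sketch below derives the alert rate and the size of the 14-day window from the figures in the question. It assumes decimal units (1 TB = 10^12 bytes, 2 KB = 2,000 bytes) and an even spread of alerts across the day. Both are simplifications, and the function name is hypothetical.

```python
BYTES_PER_TB = 10**12      # assumption: decimal terabyte
BYTES_PER_KB = 10**3       # assumption: decimal kilobyte
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_scale(daily_bytes=BYTES_PER_TB, alert_bytes=2 * BYTES_PER_KB, hot_days=14):
    """Estimate alert counts and data volume from the figures in the question."""
    alerts_per_day = daily_bytes // alert_bytes
    return {
        "alerts_per_day": alerts_per_day,
        # Average rate only; real device traffic is unlikely to be perfectly even.
        "alerts_per_second": alerts_per_day / SECONDS_PER_DAY,
        "hot_window_bytes": daily_bytes * hot_days,
        "hot_window_alerts": alerts_per_day * hot_days,
    }


estimate = estimate_scale()
print(f"alerts per day:       {estimate['alerts_per_day']:,}")
print(f"alerts per second:    {estimate['alerts_per_second']:,.0f}")
print(f"14-day window (TB):   {estimate['hot_window_bytes'] / BYTES_PER_TB:.0f}")
print(f"14-day window alerts: {estimate['hot_window_alerts']:,}")
```

Under these assumptions the workload is about 500 million alerts per day, an average of nearly 5,800 per second, with about 14 TB (7 billion alerts) inside the 14-day window. An EC2 fleet, an OpenSearch cluster, or an SQS queue with its consumers would each have to be sized and operated for that load, whereas Firehose and S3 absorb it as managed services.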