Exam Professional Machine Learning Engineer topic 1 question 277 discussion - ExamTopics

Problem

A PyTorch model deployed on n1-highcpu-16 machines in the us-central1 region of Google Cloud exhibits high latency, particularly for users in Singapore. The model classifies transactions as fraudulent or not, using numerical and categorical features.
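To make the role of distance concrete, here is a rough sketch that estimates a lower bound on network round-trip time between Singapore and us-central1. The coordinates are approximate and the fiber propagation speed is a common rule-of-thumb figure; both are assumptions for illustration, not values from the question.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: light in optical fiber travels at roughly two thirds of c.
FIBER_SPEED_KM_PER_S = 200_000.0

# Approximate (latitude, longitude) pairs, for illustration only.
SINGAPORE = (1.35, 103.82)
US_CENTRAL1 = (41.26, -95.86)  # Council Bluffs, Iowa


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km):
    """Best-case round trip over fiber, ignoring routing and processing."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


if __name__ == "__main__":
    d = great_circle_km(*SINGAPORE, *US_CENTRAL1)
    print(f"distance: {d:.0f} km, RTT floor: {min_round_trip_ms(d):.0f} ms")
```

This is only a floor: real network paths are longer than the great circle, and no change to the serving hardware in us-central1 can remove it.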

Solutions and Analysis

Four options are proposed:

  • A. Attaching an NVIDIA T4 GPU: This might speed up model computation, but it doesn't address the geographical distance between Singapore and us-central1.
  • B. Changing to n1-highcpu-32 machines: This adds processing power but doesn't solve the latency problem in Singapore, for the same reason.
  • C. Deploying to Vertex AI private endpoints in both us-central1 and asia-southeast1: This allows the application to choose the nearest endpoint, directly addressing the latency issue in Singapore. This is deemed the correct answer.
  • D. Creating another Vertex AI endpoint in asia-southeast1: Similar to option C, but it doesn't leverage the existing us-central1 deployment.
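The "choose the nearest endpoint" idea in option C can be sketched on the application side as follows. This is a minimal illustration, not the Vertex AI API: the `RegionalEndpoint` class, the country-to-region table, the endpoint IDs, and `choose_endpoint` are all hypothetical names, and the actual prediction call is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionalEndpoint:
    region: str
    endpoint_id: str  # hypothetical placeholder, not a real resource ID


# Hypothetical lookup from caller country code to the nearest serving region.
NEAREST_REGION = {
    "SG": "asia-southeast1",
    "US": "us-central1",
}
DEFAULT_REGION = "us-central1"

# Hypothetical endpoints, one per region named in option C.
ENDPOINTS = [
    RegionalEndpoint("us-central1", "endpoint-us"),
    RegionalEndpoint("asia-southeast1", "endpoint-sg"),
]


def choose_endpoint(country_code, endpoints):
    """Return the endpoint in the region nearest to the caller.

    Falls back to DEFAULT_REGION when the country is unknown or when
    no endpoint is deployed in the preferred region.
    """
    by_region = {e.region: e for e in endpoints}
    region = NEAREST_REGION.get(country_code.upper(), DEFAULT_REGION)
    return by_region.get(region, by_region[DEFAULT_REGION])


if __name__ == "__main__":
    print(choose_endpoint("SG", ENDPOINTS).region)
    print(choose_endpoint("FR", ENDPOINTS).region)
```

A static lookup table keeps the sketch simple; an application could instead measure latency to each endpoint and pick the fastest, at the cost of extra probing traffic.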

Suggested Answer

The suggested answer is C: deploy the model to Vertex AI private endpoints in both us-central1 and asia-southeast1, so that requests from Singapore are served from the nearer region and latency is minimized.
