Organizations are finding significant value in using an integrated experience for all of their data and AI with Amazon SageMaker Unified Studio. However, many organizations require strict network control to meet security and regulatory compliance requirements like HIPAA or FedRAMP for their data and AI initiatives, while maintaining operational efficiency.
In this post, we explore scenarios where customers need more control over their network infrastructure when building their unified data and analytics strategic layer. We’ll show how you can bring your own Amazon Virtual Private Cloud (Amazon VPC) and set up Amazon SageMaker Unified Studio for strict network control.
Solution overview
The solution covers the complete technical know-how for a fully private network architecture using Amazon VPC with no public internet exposure. The approach uses AWS PrivateLink through VPC endpoints to provide secure communication between SageMaker Unified Studio and essential AWS services entirely over the AWS backbone network.
The architecture consists of three core components: a custom VPC named airgapped with multiple private subnets distributed across at least three Availability Zones for high availability, a comprehensive set of VPC interface and gateway endpoints for service connectivity, and the SageMaker Unified Studio domain configured to operate exclusively within this isolated environment. This design helps ensure that sensitive data never traverses the public internet while maintaining full functionality for data cataloging, query execution, and machine learning workflows.
By implementing this air-gapped configuration, organizations gain granular control over network traffic, simplified compliance auditing, and the ability to integrate SageMaker Unified Studio with existing private data sources through controlled network pathways. The solution supports both immediate operational needs and long-term scalability through careful IP address planning and modular endpoint architecture.
Prerequisites
The setup requires an existing VPC (in this post we refer to it as airgapped, but in practice it is whichever VPC you want to use to securely set up SageMaker Unified Studio). If you don’t have an existing VPC, you can follow the SageMaker Unified Studio domain quick create administrator guide to get started.
The high-level steps to create a VPC that meets the minimum requirements for SageMaker Unified Studio are as follows:
- In the AWS Management Console, navigate to the VPC console.
- Choose Create VPC.
- Select the VPC and more radio button.
- For Name tag auto-generation, enter airgapped or a name of your choice.
- Keep the default values for IPv4 CIDR block, IPv6 CIDR block, Tenancy, NAT gateways, VPC endpoints, and DNS options.
- Select 3 for Number of Availability Zones (AZs).
- Select 0 for Number of public subnets.
- Choose Create VPC.
This produces the following VPC resource map:

Figure 1 – VPC configuration
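As a rough sketch of the address planning behind a layout like this, the following Python snippet splits a VPC CIDR block into one private subnet per Availability Zone. The 10.0.0.0/16 block, the /20 subnet size, and the AZ names are assumptions chosen for illustration, not values taken from the console. The five reserved addresses per subnet reflect standard Amazon VPC behavior.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def plan_private_subnets(vpc_cidr, az_names, subnet_prefix):
    """Return one non-overlapping private subnet per Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    candidates = vpc.subnets(new_prefix=subnet_prefix)
    plan = []
    for az, subnet in zip(az_names, candidates):
        plan.append({
            "az": az,
            "cidr": str(subnet),
            "usable_ips": subnet.num_addresses - AWS_RESERVED_PER_SUBNET,
        })
    if len(plan) < len(az_names):
        raise ValueError("VPC CIDR is too small for the requested subnets")
    return plan


# Assumed example values: a /16 VPC split into three /20 private subnets.
plan = plan_private_subnets(
    "10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"], 20
)
for entry in plan:
    print(entry)
```

Keeping the subnets well below the size of the VPC block leaves room to add more subnets later without re-addressing.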
Set up SageMaker Unified Studio
Now, we’ll set up SageMaker Unified Studio in an existing VPC, named airgapped-vpc.
- Navigate to the SageMaker console and choose Domains in the navigation pane.
- Choose Create Domain.
- For How do you want to set up your domain?, select Quick setup.
- Expand the Quick setup settings.
- Provide a name for your domain, such as airgapped-domain.
- For Virtual private cloud (VPC), select airgapped-vpc.
- For subnets, select a minimum of two private subnets.
- Choose Continue.
- Enter an email address to create a user in AWS IAM Identity Center.
- Choose Create domain.
- Once the domain is created, choose Open unified studio or use the SageMaker Unified Studio URL under Domain details to access SageMaker Unified Studio.

Figure 2 – Amazon SageMaker Unified Studio URL Welcome Page
- After logging in to SageMaker Unified Studio, create a project using the guided wizard.
- Once the project is created, add the necessary VPC endpoints to allow traffic from the project to reach AWS services.
- The S3 Gateway VPC endpoint was already selected as part of VPC creation step 5 in the prerequisites and was therefore created by default. Now we must add two more VPC endpoints, for Amazon DataZone and AWS Security Token Service (AWS STS), as illustrated in the following steps.
These are the minimum set of VPC endpoints needed to use the tooling within SageMaker Unified Studio. For a list of other mandatory and non-mandatory VPC endpoints, refer to the tables in the latter part of this post.
Create an interface endpoint
To create an interface endpoint, complete the following steps:
- Go to the SageMaker Unified Studio Project details page and copy the Project ID.

Figure 3 – SageMaker Unified Studio Project Details Page
- Go to the VPC console and choose Endpoints.
- Choose Create Endpoint.
- Enter a name for the endpoint, for example, DataZone endpoint for SageMaker Unified Studio.
- For AWS Services, enter DataZone.

Figure 4 – Interface Endpoint creation wizard for AWS Service datazone
- Select Service Name = com.amazonaws.us-east-1.datazone from the available options.

Figure 5 – Interface Endpoint creation wizard network settings
- Select the subnets in the airgapped-vpc that you created earlier.
- Filter the Security Groups by pasting the copied Project ID.
- Select the security group with Group Name datazone-<project-id>-dev.
- Choose Create Endpoint.
- Repeat the same steps to create a VPC endpoint for AWS STS.
- Once the VPC endpoints are created, validate connectivity in the SageMaker project by running a SQL query or using a JupyterLab notebook.
For a working domain and project that does not yet use any service-level features, the mandatory VPC endpoints to create are the S3 Gateway endpoint and the DataZone and STS interface endpoints. For other service-dependent operations such as authentication, data preview, and working with compute, you need additional mandatory service-specific endpoints, which are explained later in this post.
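The console steps above can also be expressed as request parameters. The following is a minimal sketch, not the authors' automation: it only assembles the keyword arguments that the EC2 CreateVpcEndpoint API accepts for the DataZone and STS interface endpoints. The VPC, subnet, and security group IDs are hypothetical placeholders, and enabling private DNS is an assumption you should confirm against your own DNS setup.

```python
def endpoint_service_name(region, service):
    """Build the regional service name, e.g. com.amazonaws.us-east-1.datazone."""
    return f"com.amazonaws.{region}.{service}"


def interface_endpoint_request(region, service, vpc_id, subnet_ids, security_group_id):
    """Assemble keyword arguments for an EC2 CreateVpcEndpoint call."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": endpoint_service_name(region, service),
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": [security_group_id],
        # Assumption: private DNS is wanted so the default service hostnames
        # resolve to the endpoint inside the VPC.
        "PrivateDnsEnabled": True,
    }


# Hypothetical identifiers; replace them with values from your own account.
endpoint_requests = [
    interface_endpoint_request(
        "us-east-1",
        service,
        "vpc-0123456789abcdef0",
        ["subnet-aaa", "subnet-bbb", "subnet-ccc"],
        "sg-0123456789abcdef0",
    )
    for service in ("datazone", "sts")
]
for request in endpoint_requests:
    print(request["ServiceName"])
```

With boto3 installed and credentials configured, each dictionary could be passed to `boto3.client("ec2").create_vpc_endpoint(**request)`. The sketch stops short of that call so that it runs without AWS access.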
Best practices for VPC setup for various use cases
When setting up SageMaker Unified Studio domain and project profiles, you need to specify the VPC network, subnets, and security groups. Here are some best practices around IP allocation, usage volume, and expected growth to consider for different use cases within enterprises.
Production and enterprise use cases
If your organization requires strict network control to meet security and compliance requirements for data and AI initiatives, consider the following best practices in your production environment.
- Use the bring-your-own (BYO) VPC approach to comply with company-specific networking and security requirements.
- Implement private networking using VPC endpoints to keep traffic within the AWS backbone.
- Use at least two private subnets across different Availability Zones.
- Enable DNS hostnames and DNS Support.
- Disable auto-assign public IP on subnets.
- Plan IP capacity for at least 5 years. Prescriptive guidance for SageMaker Unified Studio is shared in the VPC and networking details section later in this post. Consider the following:
- Number of users
- Number of apps per user
- Number of unique instance types per user
- Average number of training instances
- Expected growth percentage
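To make the capacity planning inputs above concrete, here is a hedged back-of-the-envelope sketch. The demand formula (one IP per app per instance type per user, plus training instances, compounded by annual growth) is an assumption made for illustration, not official sizing guidance, and the input numbers are invented. The /16 to /28 range and the five reserved addresses per subnet are standard Amazon VPC limits.

```python
import math

AWS_RESERVED_PER_SUBNET = 5  # addresses AWS reserves in every subnet


def projected_ip_demand(users, apps_per_user, instance_types_per_user,
                        avg_training_instances, annual_growth_pct, years=5):
    """Estimate IPs needed after `years` of compound growth (assumed model)."""
    current = users * apps_per_user * instance_types_per_user + avg_training_instances
    return math.ceil(current * (1 + annual_growth_pct / 100) ** years)


def smallest_subnet_prefix(required_ips):
    """Return the longest IPv4 prefix (smallest subnet) that fits the demand."""
    # Amazon VPC allows IPv4 subnets between /28 and /16.
    for prefix in range(28, 15, -1):
        usable = 2 ** (32 - prefix) - AWS_RESERVED_PER_SUBNET
        if usable >= required_ips:
            return prefix
    raise ValueError("demand exceeds a single /16 subnet; split it across subnets")


# Invented example inputs: 50 users, 2 apps each, 2 instance types each,
# 20 training instances on average, 20% yearly growth, 5-year horizon.
demand = projected_ip_demand(50, 2, 2, 20, 20)
prefix = smallest_subnet_prefix(demand)
print(f"{demand} IPs projected, smallest fitting subnet is /{prefix}")
```

Other services sharing the same VPC consume addresses too, so treat the result as a floor rather than a target.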
Testing and non-production use cases
For development, testing, and non-production environments where use cases don’t have stringent security and compliance requirements, use automated setup for quick experiments. Use the sample CloudFormation templates on GitHub as part of the SageMaker Unified Studio express setup to automate domain and project creation. However, this setup includes an Internet Gateway, which may not be suitable for security-sensitive environments.
Private networking use cases
VPCs with private subnets require essential service endpoints to allow client resources like Amazon EC2 instances to securely access AWS services. The traffic between your VPC and AWS services remains within the AWS network, avoiding public internet exposure.
- Implement all mandatory VPC endpoints for core services (SageMaker, DataZone, Glue, and more).
- Add optional endpoints based on specific service needs, like IPv4 endpoints, dual-stack endpoints, and FIPS endpoints to programmatically connect to an AWS service.
- Work with network administrators for:
- Preinstalling needed resources through secure channels like private subnets and self-referencing inbound rules in security groups to enable restricted access.
- Allowlisting only necessary external connections like NAT gateway IP and bastion host access in firewall rules.
- Setting up appropriate proxy configurations if required.
External data source access use cases
Consider the following when working with external systems like third-party SaaS platforms, on-premises databases, partner APIs, legacy systems, or external vendors.
- Consult with network administrators for appropriate connection methods.
- Consider AWS PrivateLink integration where available.
- Implement appropriate security measures for non-AWS data sources.
- For high availability:
- Deploy across at least three different Availability Zones (at least two for AWS Regions with only two AZs).
- Verify there is a minimum of three free IPs per subnet.
- Consider larger CIDR blocks (/16 recommended) for future scalability.
VPC and networking details
In this section, we provide details of each networking aspect, starting with the choice of VPCs, then the network connectivity details needed for integrated services to work, the basis of the VPC and subnet requirements, and finally the VPC endpoints required for private service access.
VPC
At a high level, you have two options to supply VPCs and subnets:
- Bring-your-own (BYO) VPC. This is typically the case for most customers, as most have company-specific networking and security requirements to reuse an existing VPC, or to create a VPC that is compliant with those requirements.
- Create a VPC with the SageMaker quick setup template. When creating a SageMaker Unified Studio domain (DataZone V2 domain in CloudFormation) through the automated quick setup, you are shown a Quick create stack wizard in CloudFormation, which creates the VPCs and subnets used to configure your domain.
Note: The quick create stack using the template URL is not intended for production use. The template creates an Internet Gateway, which is not allowed in many enterprise settings. This is only appropriate if you are either trying out SageMaker Unified Studio or running SageMaker Unified Studio for use cases that don’t have stringent security requirements. If you choose this option, start in the SageMaker console, navigate to Domains, and choose Create domain, followed by Create VPC. You are taken to CloudFormation, where you choose Create stack to create a sample VPC named SageMakerUnifiedStudio-VPC with just one click for trying out SageMaker Unified Studio.

Figure 6 – Create VPC button in SageMaker Unified Studio Create Domain Wizard
Cost estimation for recommended VPC setup
The exact cost depends on the configuration of your VPC. For more complex networking setups (multi-VPC), you may need to use additional networking components such as a Transit Gateway, Network Firewall, and VPC Lattice. These components may incur charges, and the cost depends on usage and AWS Region. Interface VPC endpoints are charged per Availability Zone. They also have a fixed and a variable component in the pricing structure. Use the AWS Pricing Calculator for a detailed estimate.
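The fixed and variable components of interface endpoint pricing can be sketched as a simple calculation. The rates below are placeholders invented for illustration, not actual AWS prices, and the 730 hours per month figure is a common approximation. Take real rates for your Region from the AWS Pricing Calculator.

```python
def interface_endpoint_monthly_cost(endpoint_count, az_count, hourly_rate_per_az,
                                    gb_processed, rate_per_gb, hours_per_month=730):
    """Return (fixed, variable, total) monthly cost for interface endpoints."""
    # Fixed part: every endpoint is billed per hour in every AZ it is deployed to.
    fixed = endpoint_count * az_count * hourly_rate_per_az * hours_per_month
    # Variable part: billed per GB of data processed through the endpoints.
    variable = gb_processed * rate_per_gb
    return round(fixed, 2), round(variable, 2), round(fixed + variable, 2)


# Placeholder rates (NOT real prices): 0.01 per endpoint-AZ-hour, 0.01 per GB.
# Two interface endpoints (DataZone and STS) across three AZs, 100 GB per month.
fixed, variable, total = interface_endpoint_monthly_cost(2, 3, 0.01, 100, 0.01)
print(f"fixed={fixed} variable={variable} total={total}")
```

Because the fixed part scales with both the number of endpoints and the number of Availability Zones, creating only the optional endpoints you actually use has a direct effect on the bill.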
Network connectivity
With regard to connectivity to the underlying AWS services integrated within SageMaker Unified Studio, there are two ways to enable connectivity (these are not Studio-specific; they are standard ways to enable network connectivity within a VPC). This is an important security consideration that depends on your organization’s security policies.
- Through the public internet. Your traffic traverses the public internet through an Internet Gateway in your VPC.
- Your VPC must have an Internet Gateway attached to it.
- Your public subnet must have a NAT Gateway. In addition, your public subnet’s route table must have a default route (0.0.0.0/0 for IPv4) to the Internet Gateway. This route is what makes the subnet public.
- Your private subnets must have a default route to the public subnet’s NAT Gateway.
- Through the AWS backbone. Your traffic remains within the private AWS backbone through PrivateLink (by provisioning interface and gateway endpoints for the necessary AWS services in each Availability Zone).
- A list of all the AWS services integrated into Studio and the VPC endpoints required can be found in the VPC endpoints section later in this post.
- For non-AWS resources, certain external providers of these services may offer PrivateLink integration. Check each provider’s documentation and consult your network administrator to understand the most suitable way to connect to these external providers.
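The routing rule described above, where a default route to an Internet Gateway is what makes a subnet public, can be sketched as a small classifier. The route dictionaries are a simplified, hypothetical shape rather than the exact structure returned by any AWS API, and the resource IDs are placeholders. The `igw-` and `nat-` prefixes are the standard ID prefixes for Internet Gateways and NAT Gateways.

```python
DEFAULT_IPV4_ROUTE = "0.0.0.0/0"


def classify_subnet(routes):
    """Classify a subnet from the routes in its associated route table."""
    for route in routes:
        if route["destination"] != DEFAULT_IPV4_ROUTE:
            continue
        if route["target"].startswith("igw-"):
            return "public"  # default route straight to an Internet Gateway
        if route["target"].startswith("nat-"):
            return "private-with-nat"  # egress through a NAT Gateway
    return "private-isolated"  # no default route: only local and endpoint traffic


# Hypothetical route tables for the three cases discussed in the text.
public_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0abc"},
]
nat_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "nat-0def"},
]
airgapped_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "pl-xxxxxxxx", "target": "vpce-0123"},  # gateway endpoint route
]

for name, routes in [("public", public_routes), ("nat", nat_routes),
                     ("airgapped", airgapped_routes)]:
    print(name, "->", classify_subnet(routes))
```

In the air-gapped design from this post, every subnet should fall into the last category.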
In a private networking scenario, you need to consider whether you need connectivity to non-AWS resources in a way that is compliant with your organization’s security policies. A few examples include the following:
- If you need to download software in your remote IDE host (for example, command line programs such as Ping and Traceroute).
- If you have code that connects to external APIs.
- If you use software (such as JupyterLab or Code Editor extensions) that relies on external APIs.
- If you depend on software dependencies hosted in the public domain (such as Maven, PyPI, npm).
- If you need cross-Region access to certain resources (such as access to S3 buckets in a different Region).
- If you need functionality whose underlying AWS services do not have VPC endpoints in all Regions or any Region:
- Amazon Q (powers Q and code suggestions)
- SQL Workbench (powers Query Editor)
- IAM (powers Glue connections)
- If you need to connect to data sources outside of AWS (such as Snowflake, Microsoft SQL Server, Google BigQuery).
Enterprise network administrators must also complete either of the following prerequisites to handle private networking scenarios:
- Preinstall needed resources through secure channels if possible. An example would be to customize your SageMaker AI image by installing dependencies after they have been code scanned and vetted technically and legally by your organization.
- If AWS PrivateLink integration is not available for external providers, allowlist network connections to these external sources. Allow firewall egress rules, directly or indirectly, through a proxy in your organization’s network. Check with your network administrator to understand the most appropriate option for your organization.
VPC requirements
When setting up a new SageMaker Unified Studio domain, you must supply a VPC. It’s important to note that these VPC requirements are a union of all the requirements from the respective compute services integrated into Studio, some of which are reinforced by validation checks during the corresponding blueprint’s deployment. If the requirements that have validation checks are not fulfilled, the resources contained in that blueprint may fail to create on project creation (on-create) or when creating the compute resource (on-demand). This section presents a summary of these requirements, as well as the relevant documentation links from which they originate.
Subnet requirements for specific compute in a VPC
This section lists the compute services integrated in SageMaker Unified Studio that require VPC/subnets when provisioning the respective compute resources.
Compute Connections
Other Services
Requirements
- Number of subnets: At least two private subnets. This requirement comes from Redshift Serverless.
- Availability Zones (AZs): At least two different AZs (for Regions with two AZs, two subnets are sufficient). This requirement comes from Redshift Serverless. For workgroups with Enhanced VPC Routing (EVR), you need three AZs.
- Free IPs per subnet: At least three IPs per subnet. This requirement comes from Redshift Serverless without EVR. For detailed IP address requirements with EVR-enabled workgroups, refer to Serverless usage considerations. Three is a minimum and may not be enough for your needs. For example, EMR cluster creation fails if no subnets with enough IPs are found in the VPC. We recommend doing a forward-looking capacity planning exercise based on your use cases (for example, growth rate, users, compute needs) to project at least 5 years into the future. This helps determine how many IPs are needed by the team using Studio and by other services that use this VPC, and gives you a ceiling for the CIDR block size.
- Private or public subnets: We enforce that at least three private subnets be supplied, and recommend that only private subnets are selected, with a few nuances. This requirement comes from SageMaker AI domain. A new SageMaker AI domain, when set up with VpcOnly mode, requires that all subnets in the VPC be private. This is the default networking mode in the Tooling blueprint. If you choose to use PublicInternetOnly mode, this restriction doesn’t apply, and you may choose public subnets from your VPC. To change the mode, modify the Tooling blueprint parameter sagemakerDomainNetworkType.
- Enable DNS hostname and DNS Support: Both must be enabled. This requirement comes from EMR. Without these VPC settings, enableDnsHostname and enableDnsSupport, connecting to the EMR cluster using the private DNS name through the Livy endpoint fails. SSL verification can only be done when connecting using the DNS name, not the IP.
- Auto-assign public IP: Disable. We recommend that this EC2 subnet setting (mapPublicIpOnLaunch) be disabled when using private subnets, because public IPs come at a cost and are a scarce resource in the total addressable IPv4 space.
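The requirements above lend themselves to a pre-flight check before creating a domain. The sketch below is an illustration only: the `Subnet` record and `check_vpc` function are hypothetical names, the inputs would have to be gathered from your own VPC, and the thresholds are parameters because the minimums in this post differ by service (for example, two subnets for Redshift Serverless and three AZs with EVR).

```python
from dataclasses import dataclass


@dataclass
class Subnet:
    subnet_id: str
    az: str
    free_ips: int
    is_private: bool
    map_public_ip_on_launch: bool


def check_vpc(subnets, dns_hostnames, dns_support,
              min_subnets=2, min_azs=2, min_free_ips=3):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    private = [s for s in subnets if s.is_private]
    if len(private) < min_subnets:
        problems.append(f"need at least {min_subnets} private subnets, found {len(private)}")
    azs = {s.az for s in private}
    if len(azs) < min_azs:
        problems.append(f"need at least {min_azs} Availability Zones, found {len(azs)}")
    for s in private:
        if s.free_ips < min_free_ips:
            problems.append(f"{s.subnet_id}: only {s.free_ips} free IPs")
        if s.map_public_ip_on_launch:
            problems.append(f"{s.subnet_id}: auto-assign public IP should be disabled")
    if not (dns_hostnames and dns_support):
        problems.append("DNS hostnames and DNS support must both be enabled")
    return problems


# Hypothetical inventory of the airgapped VPC.
good = [Subnet(f"subnet-{i}", f"us-east-1{az}", 4000, True, False)
        for i, az in enumerate("abc")]
bad = [Subnet("subnet-x", "us-east-1a", 2, True, True)]

print(check_vpc(good, dns_hostnames=True, dns_support=True))
for problem in check_vpc(bad, dns_hostnames=True, dns_support=False):
    print(problem)
```

Passing `min_subnets=3` and `min_azs=3` tightens the check for the stricter cases mentioned in the list.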
VPC endpoints
If you choose to run SageMaker Unified Studio without public internet access, VPC endpoints are required for all services SageMaker Unified Studio needs to access. These endpoints provide secure, private connectivity between your VPC and AWS services without traversing the public internet. The following tables list the required endpoints, their types, and what each is used for.
Some endpoints may not show up directly in your browser’s network tab. The reason is that some of these services (such as CloudWatch) are transitively invoked by other services.
Mandatory endpoints
The following are required endpoints for SageMaker Unified Studio and supporting services to function properly. Gateway endpoints can be used where available; use interface endpoints for all other AWS services.
| AWS service | Endpoint | Type | Purpose |
| Glue | | Interface | For Data Catalog and metadata management |
| STS | | Interface | Required for assuming IAM roles |
| S3 | | Gateway | Required for datasets, Git backups, notebooks, and Git sync |
| SageMaker | | Interface | Required for calling SageMaker APIs |
| SageMaker | | Interface | For invoking deployed inference endpoints |
| DataZone | | Interface | For data catalog and governance |
| Secrets Manager | | Interface | To securely access secrets |
| SSM | | Interface | For secure command execution |
| SSM | | Interface | Enables live SSM sessions |
| KMS | | Interface | For decrypting data (volumes, S3, secrets) |
| EC2 | | Interface | For subnet and ENI management |
| EC2 | | Interface | Required for SSM messaging |
| Athena | | Interface | Required to run SQL queries |
| Amazon Q | | Interface | Used by SageMaker Notebooks for enhanced productivity |
Optional endpoints
Only create these if the corresponding service is used in your environment.
| AWS service | Endpoint | Type | Purpose |
| EMR | | Interface | Serverless Spark/Hive jobs |
| EMR | | Interface | Required for Livy job submission (EMR Serverless) |
| EMR | | Interface | Classic EMR (EC2-based) |
| EMR | | Interface | EMR on EKS workloads |
| Redshift | | Interface | For provisioned Redshift clusters |
| Redshift | | Interface | For Redshift Serverless |
| Redshift | | Interface | Required for running SQL against Redshift |
| Amazon Bedrock | | Interface | Invoke Bedrock models at runtime |
| Amazon Bedrock | | Interface | For Bedrock knowledge agents |
| Amazon Bedrock | | Interface | For running knowledge agent workloads |
| CloudWatch | | Interface | Application and notebook logs |
| RDS | | Interface | Connect to Amazon RDS and Aurora |
| CodeCommit | | Interface | Git integration with CodeCommit |
| CodeCommit | | Interface | Alternate endpoint for CodeCommit |
| CodeConnections and CodeStar | | Interface | GitHub and GitLab repo integration |
| CodeConnections and CodeStar | | Interface | Alias of CodeConnections |
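One way to use the two tables is as a checklist: start from the mandatory services, add the optional ones your projects rely on, and compare the result with what is already provisioned. The sketch below is a hypothetical helper, not part of any AWS tooling. It keys on the service labels from the tables, because the individual endpoint names are not reproduced here, so a service that needs several endpoints still appears as a single entry.

```python
# Service labels and endpoint types taken from the mandatory table.
MANDATORY = {
    "Glue": "Interface", "STS": "Interface", "S3": "Gateway",
    "SageMaker": "Interface", "DataZone": "Interface",
    "Secrets Manager": "Interface", "SSM": "Interface", "KMS": "Interface",
    "EC2": "Interface", "Athena": "Interface", "Amazon Q": "Interface",
}
# Service labels from the optional table (all interface endpoints).
OPTIONAL = {
    "EMR", "Redshift", "Amazon Bedrock", "CloudWatch", "RDS",
    "CodeCommit", "CodeConnections and CodeStar",
}


def required_services(services_in_use):
    """Mandatory services plus the optional services the project actually uses."""
    unknown = set(services_in_use) - OPTIONAL
    if unknown:
        raise ValueError(f"not in the optional table: {sorted(unknown)}")
    return sorted(MANDATORY) + sorted(services_in_use)


def missing_services(services_in_use, provisioned):
    """Services that still need endpoints, given what is already provisioned."""
    return [s for s in required_services(services_in_use) if s not in provisioned]


# Example: only the minimal S3, DataZone, and STS endpoints exist, and the
# project additionally uses EMR.
todo = missing_services({"EMR"}, {"S3", "DataZone", "STS"})
print(todo)
```

Keeping the lists as plain data makes it straightforward to extend them once you have confirmed the exact endpoint names for your Region.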
Clean up
AWS resources provisioned in your AWS accounts may incur costs based on the resources consumed. Make sure you don’t leave any unintended resources provisioned. If you created a VPC and subsequent resources as part of this post, make sure to delete them.
The following service resources provisioned during this blog post need to be deleted:
- IAM Identity Center users and groups.
- Resources provisioned within your project using tooling configuration and blueprints within your domain.
- The airgapped VPC.
Conclusion
In this post, we walked through the process of using your own existing VPC when creating domains and projects in SageMaker Unified Studio. This approach benefits customers by giving them greater control over their network infrastructure while using the comprehensive data, analytics, and AI/ML capabilities of Amazon SageMaker. We also explored the critical role of VPC endpoints in this setup. You now understand when these become necessary components of your architecture, particularly in scenarios requiring enhanced security, compliance with data residency requirements, or improved network performance.
While using a custom VPC requires more initial setup than the Quick Create option, it provides the flexibility and control many organizations need for their data science and analytics workflows. This approach provides a mechanism for your SageMaker environment to integrate with your existing infrastructure and adhere to your organization’s networking policies. Custom VPC configurations are a powerful tool in your arsenal for building secure, compliant, and efficient data science environments.
To learn more, visit Amazon SageMaker Unified Studio – Administrator Guide and User Guide.


