The World of NeRFs: Unveiling the Power of Neural Radiance Fields

In recent years, advancements in computer vision and 3D graphics have pushed the boundaries of what is achievable. One groundbreaking innovation that has gained significant attention, helped along by increasingly affordable GPUs, is Neural Radiance Fields (NeRFs)[1]. NeRFs combine neural networks[2] and radiance fields[3] to revolutionize how we perceive and interact with digital content.

NeRFs represent a paradigm shift in computer graphics by allowing the creation of realistic 3D models and scenes from limited input data. Through deep learning and sophisticated mathematical models, NeRFs have the potential to transform industries such as entertainment, gaming, architecture, virtual reality[4] and medicine.

Traditional 3D graphics have relied on polygonal mesh models or voxel-based representations, which often fall short in capturing intricate details, realistic lighting and complex reflections. NeRFs address these limitations by providing a more accurate and immersive representation of 3D content.

At its core, a Neural Radiance Field is a learned function that predicts the appearance properties of a 3D point seen from a given viewing direction, such as its color and opacity. Unlike traditional techniques, NeRFs learn shape and appearance directly from a training dataset, which enables them to capture fine-grained details and reproduce realistic materials and lighting conditions.

The magic of NeRFs lies in their ability to infer the radiance, or the amount of light traveling along a given ray in 3D space. By modeling the radiance field through a neural network, NeRFs can estimate the appearance of objects from different viewpoints and lighting conditions. This not only enables stunning visualizations but also facilitates interactive exploration of the 3D scene, allowing users to navigate and interact with virtual objects in an immersive manner.
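To make this a little more concrete, the sketch below shows the bare bones of the idea in PyTorch: a small network maps a 3D position and viewing direction to a color and a density, and samples along a camera ray are alpha-composited into a single pixel. The names TinyNeRF and render_ray are ours, purely for illustration; real systems such as the original NeRF paper and Nerfstudio add positional encoding, hierarchical sampling and many other refinements.

```python
# Minimal, illustrative sketch of the NeRF idea (not Nerfstudio's actual implementation).
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),   # input: (x, y, z) position + (dx, dy, dz) view direction
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # output: RGB color (3) + volume density (1)
        )

    def forward(self, positions, directions):
        rgb_sigma = self.mlp(torch.cat([positions, directions], dim=-1))
        rgb = torch.sigmoid(rgb_sigma[..., :3])   # color in [0, 1]
        sigma = torch.relu(rgb_sigma[..., 3])     # non-negative density
        return rgb, sigma

def render_ray(model, origin, direction, near=2.0, far=6.0, n_samples=64):
    """Alpha-composite samples along one camera ray into a single RGB pixel."""
    t = torch.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction       # sample points along the ray
    dirs = direction.expand(n_samples, 3)
    rgb, sigma = model(points, dirs)
    delta = t[1] - t[0]                            # spacing between samples
    alpha = 1.0 - torch.exp(-sigma * delta)        # opacity of each segment
    weights = alpha * torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha[:-1] + 1e-10]), dim=0
    )                                              # transmittance-weighted contribution
    return (weights[:, None] * rgb).sum(dim=0)     # final pixel color

# Example usage: query one ray through an untrained model.
model = TinyNeRF()
pixel = render_ray(model, origin=torch.zeros(3), direction=torch.tensor([0.0, 0.0, 1.0]))
```

Training simply compares pixels rendered this way against the pixels of the input photos and backpropagates through the rendering process.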

NeRFs have vast applications across various industries. In gaming and entertainment, they can create photorealistic virtual worlds that blur the line between reality and simulation. Architects and designers can use NeRFs to visualize and iterate on complex architectural designs, providing clients with realistic virtual walkthroughs. Medical professionals can employ NeRFs for more accurate visualization of anatomical structures, aiding in surgical planning and training.

Join us on this captivating journey as we explore the power of NeRFs and witness the transformation of computer graphics and virtual experiences as we know them.

Our Journey: Choosing AWS SageMaker for End-to-End ML-Ops Training

POC Video 1 - Generated NeRF (on Google Colab) of an indoor living space from a regular video recorded on a handheld device.

POC Video 2 - Generated NeRF (on AWS SageMaker) of an indoor living space from a regular video recorded on a handheld device, which shows a significant improvement in the quality of the rendered video.

When it comes to training and deploying machine learning (ML) models at scale, one platform that stands out is AWS SageMaker. With its comprehensive suite of tools and services, SageMaker offers a powerful solution for end-to-end ML-Ops training. In this section, we will explore why choosing AWS SageMaker is a compelling option, particularly when considering the wide usage of Hugging Face and its seamless integration with AWS services[5].
  1. Streamlined Workflow: AWS SageMaker provides a streamlined workflow for ML model development and deployment. It offers a unified interface that simplifies the entire ML lifecycle, from data preprocessing and model training to deployment and inference. This unified approach eliminates the need for managing multiple tools and services, resulting in increased productivity and reduced development time.
  2. Scalability and Flexibility: SageMaker leverages the elastic infrastructure of AWS, enabling seamless scalability to handle large datasets and complex ML models. With SageMaker, you can easily spin up instances with varying compute power, ensuring efficient resource utilization and faster training times. The flexibility to choose the appropriate instance type for your specific workload allows you to optimize costs while meeting your performance requirements.
  3. Hugging Face Integration: Hugging Face has emerged as a popular open-source library for natural language processing (NLP) tasks, offering a rich collection of pre-trained models and fine-tuning capabilities. SageMaker integrates seamlessly with Hugging Face, providing a cohesive environment for NLP model development and deployment. This integration allows you to leverage Hugging Face’s powerful transformers library, fine-tune models on your custom datasets, and deploy them easily using SageMaker’s hosting services (a short illustrative sketch follows this list).
  4. Managed Services and Auto Scaling: SageMaker offers a range of managed services that simplify ML-Ops tasks. For example, SageMaker Ground Truth facilitates the annotation of training data, accelerating the data labeling process. SageMaker also provides automated model tuning, which helps optimize hyperparameters for better model performance. Additionally, SageMaker’s Auto Scaling feature dynamically adjusts compute resources based on demand, ensuring cost efficiency and optimal performance.
  5. End-to-End ML-Ops Pipeline: With SageMaker, you can build a complete ML-Ops pipeline within the AWS ecosystem. From data ingestion with services like AWS Glue and AWS Data Pipeline, to model training and deployment with SageMaker, to monitoring and management using Amazon CloudWatch and AWS Step Functions, you have a comprehensive suite of tools at your disposal. This end-to-end integration simplifies the ML workflow, enhances collaboration and ensures smooth deployment and monitoring of ML models in production.
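To give a flavour of what this looks like in practice, here is a hedged, minimal sketch of launching a Hugging Face fine-tuning job with the SageMaker Python SDK. The script name, S3 paths and version pins are placeholders, and the supported framework version combinations depend on your SDK release and region.

```python
# Illustrative sketch: launching a Hugging Face fine-tuning job on SageMaker.
# "train.py", the S3 paths and the version pins below are placeholders.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()      # IAM role attached to the Studio user

estimator = HuggingFace(
    entry_point="train.py",                # your training script (placeholder)
    source_dir="./scripts",
    role=role,
    instance_type="ml.g4dn.2xlarge",       # NVIDIA T4 GPU instance
    instance_count=1,
    transformers_version="4.26",           # check the SDK docs for supported combinations
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters={"epochs": 3, "model_name_or_path": "distilbert-base-uncased"},
)

# Training data previously uploaded to S3 (placeholder bucket/prefix).
estimator.fit({"train": "s3://my-bucket/datasets/train"})
```

The same estimator pattern carries over to other frameworks, which is part of what makes SageMaker convenient as the backbone of an ML-Ops pipeline.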
Figure 1 - AWS SageMaker ML-Ops overview[6]

In conclusion, choosing AWS SageMaker for end-to-end ML-Ops training offers numerous advantages. Its streamlined workflow, scalability, flexibility and integration with Hugging Face make it an attractive choice for data scientists and ML practitioners. By leveraging the power of AWS infrastructure and services, combined with the capabilities of Hugging Face, SageMaker enables efficient and effective ML model development and deployment, setting the stage for successful AI-driven applications.

In this blog post, we will further explore the ML-Ops capabilities of AWS SageMaker by training a NeRF from a regular video and rendering it into a pixel-accurate volumetric representation of the 3D space. We will create a Jupyter Notebook[7] that runs on AWS SageMaker to train and render a complete pixel-accurate volumetric representation of a 3D space recorded using a simple handheld device (in this case, a Samsung Galaxy S22 Ultra with the default settings).

By the end of this blog, you will have gone through the following:

  • Part 1: Go through prerequisites for interacting with AWS SageMaker Studio.

  • Part 2: Import a Jupyter Notebook on AWS SageMaker Studio[8] from a GitHub repository. This can be any regular .ipynb notebook that you have built on any other cloud provider – for example Google Colab[9], Azure ML, your local machine, etc. We will be using AWS Accelerated instance types[10] and special containerized workloads that are optimized for deep learning[11]. Additionally, an FAQ section will discuss the changes we made to the Google Colab NeRF notebook to get it working with all underlying dependencies on SageMaker Studio.

  • Part 3: Render a full NeRF from a regular video with customizable camera positions, using a purpose-built Jupyter Notebook for AWS SageMaker that has been tested to generate the results covered in this blog.

So let’s get started.

Part 1: Setting up AWS SageMaker Studio

This section will largely focus on pointing you to the right documentation, which provides step-by-step guides for getting started with AWS SageMaker, along with a brief summary of what each link gears you up for:
  1. Get started – Amazon SageMaker – This is a guide for getting started with Amazon SageMaker, a machine learning service provided by Amazon Web Services (AWS). It covers the basics of setting up and using SageMaker, including creating an IAM role, launching a notebook instance and running a sample notebook.
  2. Set Up Amazon SageMaker Prerequisites – Amazon SageMaker – This guide focuses on the setup process for using Amazon SageMaker. It provides step-by-step instructions for setting up your AWS account, creating an IAM role and configuring your environment to work with SageMaker. It also includes information on managing permissions and security settings (the short snippet after this list can be used to confirm your environment is wired up).
  3. Onboard to Amazon SageMaker Domain – Amazon SageMaker – This explains how to onboard to Amazon SageMaker Studio, a web-based integrated development environment (IDE) for machine learning. It provides instructions on creating a SageMaker Studio domain, launching a Studio notebook and accessing the Studio interface. It also covers setting up Git integration and managing user profiles.
  4. Supported Regions and Quotas – Amazon SageMaker – (Optional/Supplementary information) This discusses the regions and quotas related to Amazon SageMaker. It provides an overview of the different AWS regions where SageMaker is available and explains the concept of quotas, which are limits on resource usage in SageMaker. The page also provides information on how to request a quota increase and highlights some of the key service limits to be aware of when using SageMaker.
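Once the prerequisites are in place, a quick sanity check from inside a Studio notebook can confirm that the SDK, the execution role and the default S3 bucket are all reachable. This is a hedged sketch using standard SageMaker Python SDK calls:

```python
# Quick sanity check inside a SageMaker Studio notebook.
import sagemaker

session = sagemaker.Session()
role = sagemaker.get_execution_role()      # IAM role created during the prerequisites step

print("SageMaker SDK version:", sagemaker.__version__)
print("Region:", session.boto_region_name)
print("Execution role:", role)
print("Default S3 bucket:", session.default_bucket())
```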
At this point, if you log into the SageMaker section of your AWS Management Console, it should look as follows:
Figure 2 - AWS SageMaker domain live
Figure 3 - Your admin dashboard for AWS SageMaker Studio

Once you have worked through (1), (2) and (3) above, you can continue to Part 2.

Part 2: Setting up an Example Notebook on SageMaker Studio With Accelerated Computing

In this part, we are going to use an open-sourced Google Colab notebook[9] that is part of the Nerfstudio[13][14] project. Since we have already addressed the service tradeoffs of using AWS SageMaker, we will primarily focus on getting a generic Colab notebook ported to AWS SageMaker. In a later section, Part 4, we will address why it is worth spending development time and effort porting Colab-based notebooks to AWS SageMaker, and look at the cost and performance aspects of that approach. For now, let’s put our DevOps hat on and get ready to patch breaking stuff!

Once ready, go ahead and open AWS SageMaker Studio from your admin dashboard (refer to Figure 2). The FAQ section below deals with porting existing Google Colab notebooks to AWS SageMaker. If you want to jump right into the action of NeRFs, feel free to use this public GitHub repository instead. It contains purpose-built Jupyter notebooks that have been built and tested on AWS SageMaker for running containerized workloads optimized for deep learning, with annotated documentation for each cell. At this point, you can jump to Part 3.

However, reading through the section below should help you troubleshoot errors and understand the general approach to selecting instance and container types in SageMaker Studio.

⚠️ Please read through the selection criteria for the Image and Instance types[16] for our containerized workloads in Part 3 before actually attempting to run the notebook. There is a chance you will need to rebuild one of Nerfstudio's dependencies from source against a specific gcc version, along with the npm libraries this notebook uses for rendering. So if this is your first time doing this, and you are not very familiar with Jupyter Notebooks or Linux operating systems, it is highly recommended that you use the exact notebook instance and image types shown in Part 3.

Getting NeRF Colab Imported Into AWS SageMaker

Clone the GitHub repository for NeRF[13]. Feel free to use the regular NeRF project notebook, or the one located in this public GitHub repository; it is an exact copy of the one hosted by Nerfstudio[14].
GitHub repository cloned for NeRF
This clones a simple GitHub repository containing only the notebook we are going to port, without the additional material that typically comes along with open-source repositories.
  1. Once the repository is cloned, use the File Browser and simply double-click the Colab notebook as shown in the diagram below. It will launch the notebook in SageMaker and you will be prompted to choose the Environment Configuration[17]. Please refer to Part 3 for the configuration used in this blog, and make your selection accordingly.
  2. Go ahead and review the FAQs below for general questions.
  3. Remove all references to pip wheels stored on and installed from Google Drive, and replace each dependency with the standard installation from its own documentation. You can run a diff command in the terminal to check how the notebooks deviate in their installation steps.
Installation using Google Drive-stored wheels in the original Colab notebook.
Installation from the standard repositories of all dependencies.

As you can see in the above comparison, we are building COLMAP and tiny-cuda-nn from their source repositories, using the latest versions instead of pinned snapshot wheels.
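For reference, such an installation cell might look roughly like the following. This is a hedged sketch based on Nerfstudio's public installation instructions rather than a verbatim copy of the blog's notebook; the exact torch build, and whether conda is available, depend on the SageMaker image you selected.

```python
# Illustrative notebook cell: installing Nerfstudio and its dependencies from
# standard sources instead of pre-built wheels on Google Drive.
# Assumption: the image already ships a CUDA-enabled PyTorch (or you install
# one matching its CUDA toolkit first).
%pip install --upgrade pip

# tiny-cuda-nn Python bindings, built from source from the NVlabs repository.
%pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# Nerfstudio itself.
%pip install nerfstudio

# COLMAP (camera pose estimation). conda-forge is one option if conda is
# available on the image; otherwise use the system package manager.
!conda install -y -c conda-forge colmap
```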

  4. Use the in-session terminal to debug any installation issues; some dependencies might break. For example, the rendering uses npm packages that need to be installed, and that in turn requires a specific gcc version, so you might need to install these libraries on the underlying compute environment (a hedged example follows).
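Purely as an illustration, the kind of commands you might run to check and patch the toolchain could look like this. Whether you need them at all, and which versions, depends entirely on the image you selected, so treat this as a hedged starting point rather than a required step.

```python
# Illustrative troubleshooting cell: check the compiler and Node.js toolchain
# that some dependencies and the web viewer rely on.
!gcc --version
!node --version || echo "Node.js not found"

# If Node.js/npm are missing, conda-forge is one way to get them without
# touching the base image's package manager (assumption: conda is available
# on the image; otherwise use apt or another package manager).
!conda install -y -c conda-forge nodejs
```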

⚠️ Please don’t hesitate to create an issue on the GitHub repository and I will be happy to assist. Also, kindly create an issue if you feel the documentation of the notebooks themselves is lacking.

Once all the dependencies are installed you should be able to go ahead and render your own NeRF.

If you wish to use the data I used for the NeRF generated on the blog, please feel free to reach out by email and I would be happy to share the video.

Part 3: Generating NeRFs on AWS SageMaker Studio

Environment Configuration[17]

Set up notebook environment.
  • Image: TensorFlow 2.10.0 Python 3.9 GPU Optimized
  • Kernel: Python3
  • Instance Type: ml.g4dn.2xlarge. Feel free to change this as needed, but try to stay within the class of instances built on the Turing architecture[18][19]. The G4 instance class on AWS is backed by NVIDIA T4 GPUs, which use the Turing architecture with Tensor Cores optimized for deep learning.
  • Start-up Script: None.
Once the notebook is live you can get started! Starting up the notebook is pretty instantaneous. Your terminal should look like this:
NeRF Studio on AWS SageMaker Notebook

At this point, you can head over to the notebook itself. It has the necessary documentation per cell to guide you through the rest of the journey!
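For orientation, the core steps the notebook walks you through map onto Nerfstudio's command-line tools roughly as follows. This is a hedged outline of the typical Nerfstudio workflow rather than the exact cells of the notebook; the paths are placeholders and flag names can vary between Nerfstudio releases.

```python
# Illustrative outline of the workflow the notebook automates (paths are placeholders).

# 1. Extract frames from the handheld video and estimate camera poses with COLMAP.
!ns-process-data video --data ./my_room.mp4 --output-dir ./processed/my_room

# 2. Train a NeRF ("nerfacto" is Nerfstudio's recommended default method).
!ns-train nerfacto --data ./processed/my_room

# 3. Render a video along a camera path exported from the viewer.
!ns-render camera-path --load-config ./outputs/my_room/nerfacto/config.yml \
    --camera-path-filename ./camera_path.json --output-path ./renders/my_room.mp4
```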

Happy NeRF-ing!

⚠️ If you encounter any issues with the Notebook while running it on AWS SageMaker Studio, please don’t hesitate to create an Issue on the GitHub repository and I will be happy to assist.

FAQs

Please refer to the FAQs in the Original Colab Notebook as well!

Can I run a Jupyter notebook built on another cloud provider (for example Google Colab) on SageMaker Studio?

Technically, yes! Take a look at the following documentation to understand how Jupyter Notebooks are run on SageMaker Studio. But if you are porting a notebook from another cloud provider, you will have to make some adjustments, especially for services the notebook uses on its native provider. In our case, for example, we had to do away with a Google Colab utility that is used to interactively upload files from your local machine directly into the notebook, but this wasn't functionality we needed to port for the MVP. SageMaker's file browser provides upload capability, which works faster than compiled code in the notebook. You could replace the functionality with the boto3 SDK for Python to interact with S3 and then import the files into the compute environment's file system, but for the purpose of a demo we didn't need that.
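If you do want to keep an in-notebook data transfer step rather than using the file browser, a hedged sketch of the boto3 alternative mentioned above might look like this (the bucket and key names are placeholders):

```python
# Illustrative alternative to Colab's interactive upload: pull input data
# from S3 into the notebook's local file system with boto3.
import boto3

s3 = boto3.client("s3")
s3.download_file(
    Bucket="my-nerf-input-bucket",       # placeholder bucket
    Key="videos/living_room.mp4",        # placeholder object key
    Filename="/tmp/living_room.mp4",     # local path inside the notebook environment
)
```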

Should I use SageMaker Studio notebooks or classic SageMaker notebook instances?

For this blog, and for any experimental ML-Ops workload that you are trying to build in a dev environment, we definitely recommend SageMaker Studio notebooks. Take a look at this documentation for the comparison.

How long does it take to train a NeRF from a regular video?

If it’s a high-resolution video, it can take upwards of an hour. You can increase the vCPU count by selecting a G4DN instance with a higher thread count, which will bring the training time down, but at a higher cost. We therefore recommend smaller instances for training experimental workloads, and then identifying the scaling needed for a production workload. The training uses the vCPUs and the rendering uses the GPU.

Can the input video be recorded on a regular handheld device?

Yes. The video used in this blog is of my own house, recorded on a Samsung Galaxy S22 Ultra. I simply walked around my living room taking the video. It was sampled at high resolution, so training on this data took about an hour.

References
