By leveraging machine learning (ML), businesses can differentiate themselves from their competitors by offering unique and innovative solutions powered by data-driven insights and predictions. These solutions, in turn, provide customers with a superior experience and drive business growth and success. That’s why at Cedar, we are using ML to build better patient experiences and streamline patients’ financial journeys through cutting-edge technologies, optimization and personalization. However, in order to fully realize the benefits of ML, we must have the right infrastructure in place.
ML infrastructure refers to a collection of hardware (e.g. cloud compute resources), software, tools and frameworks designed to facilitate the development, deployment and management of ML models.
Follow along as we share the business value of ML infrastructure for models in production at Cedar. Value-adds include:
- Infrastructure security
- Reliability
- Scalability
- Improved model performance
- Cost-effectiveness
1) Infrastructure security

Infrastructure security and protecting sensitive data are crucial aspects of ML infrastructure. At Cedar, compliance with regulations and contractual obligations surrounding healthcare data is a paramount consideration when designing any infrastructure. We prioritize encryption and strict access control to ML databases (along with other databases containing sensitive data), and we enforce logical separation of different providers' data. By building out regulation-compliant infrastructure within our private network, Cedar ensures that sensitive data and models are stored and processed in a secure environment, protecting against data breaches and other security threats. This helps us maintain the trust of patients and healthcare providers while adhering to industry regulations.
Cedar requires both real-time and near-real-time prediction serving infrastructure, as demonstrated by two new ML applications:
- OCR (Optical Character Recognition) and NLP (Natural Language Processing) to detect key information from a patient's insurance card in real time
- ML to personalize payment plan lengths offered to a patient in near real time
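To make the distinction between the two serving patterns concrete, here is a minimal, self-contained sketch. The stub model and function names are illustrative only; in production the `predict` call would hit a hosted model endpoint rather than run in-process.

```python
import queue

def predict(features):
    # Stub model: in production this would be a call to a hosted model
    # endpoint (e.g. on SageMaker or ECS). Names here are illustrative.
    return {"score": sum(features) / len(features)}

# Real-time: the caller blocks on a single prediction inside the request
# path, as when reading an insurance card during a patient's session.
def predict_realtime(features):
    return predict(features)

# Near-real-time: requests accumulate briefly in a queue and are scored in
# small batches, trading a little latency for throughput.
def predict_near_realtime(request_queue, max_batch=32):
    batch = []
    while not request_queue.empty() and len(batch) < max_batch:
        batch.append(request_queue.get())
    return [predict(features) for features in batch]

requests = queue.Queue()
requests.put([1.0, 2.0, 3.0])
requests.put([4.0, 5.0, 6.0])

print(predict_realtime([1.0, 2.0, 3.0]))  # one request, answered immediately
print(predict_near_realtime(requests))    # two requests, scored together
```

The difference is where the latency budget sits: the real-time path must answer within a single user interaction, while the near-real-time path can amortize work across a short batching window.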
A generic ML infrastructure is designed to support a broad range of ML models and applications by including various model types, data pipelines, and model training and deployment pipelines. Essentially, it's a single framework that encompasses the full ML lifecycle, from data ingestion to model deployment and monitoring. Though it offers a versatile and wide-ranging foundation for many ML tasks, the inherent variability of each use-case's requirements can lead to challenges with performance, scalability and, consequently, serviceability.
2) Reliability

At Cedar, we run our ML models on specialized ML infrastructure, which is more reliable than running them on generic infrastructure. A specialized infrastructure ensures that models and data solutions are consistently available and capable of handling failover and redundancy, reducing downtime risk. We leverage AWS (Amazon Web Services) services like SageMaker Endpoints and ECS (Elastic Container Service) to host our models based on each use-case's requirements. SageMaker's ML-specific features, such as automatic scaling and multi-AZ deployments, ensure reliable model availability, making it our preferred choice for standard ML workflows. For more complex use-cases requiring customization, we opt for ECS: its flexible container orchestration and robust features like load balancing and automatic recovery provide reliable hosting for our models. By choosing the most appropriate service for each use-case, we ensure optimal reliability and performance for our models.
By hosting our models on AWS, we efficiently manage network communication and streamline resource and application management. Operating within a private network also allows us to significantly reduce latency.
Overall, this results in seamless integration between the models and multiple Cedar products, along with facilitating reuse of other resources within the infrastructure. Additionally, by maintaining complete visibility into, and understanding of, the model and the data (from both training and inference) through our ML data infrastructure, we can proactively resolve model issues, since we better understand how both the data and the model's behavior change over time.
3) Scalability

Another important benefit of our specialized ML infrastructure is that it easily enables scalability. As we deploy additional models for different use-cases and onboard additional clients leveraging these solutions, it's important to choose the right hosting environment for each model. Utilizing AWS's extensive suite of scalable solutions, such as Auto Scaling groups, Elastic Load Balancing, Amazon EC2, and SageMaker, we can dynamically adjust the computing resources that serve our models in response to fluctuating workloads. This approach not only enhances the overall performance and reliability of our models but also ensures that our ML infrastructure can adapt to changing demands, fostering an agile and resilient environment for model deployment and management.
Additionally, the infrastructure needs to account for new datasets to support new clients and use-cases, while addressing changes in datasets for existing clients and use-cases. A scalable data pipeline architecture supports flexible data ingestion by handling a variety of data types and formats, modular transformation and processing to easily add or modify transformation steps, and increasing volumes of data through distributed processing.
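The properties above, flexible ingestion, modular transformation and distributed processing, can be illustrated with a minimal sketch of a modular transformation pipeline. Each step is a plain function, so supporting a new client or use-case means adding a function to a list rather than rewriting the pipeline. The field names are illustrative, not Cedar's actual schema.

```python
def normalize_amount(record):
    # Convert a dollar amount to integer cents for downstream features.
    record["amount_cents"] = int(round(record.pop("amount_dollars") * 100))
    return record

def add_has_payment_history(record):
    # Derive a simple boolean feature from a raw count.
    record["has_payment_history"] = record.get("num_prior_payments", 0) > 0
    return record

# Adding or modifying a transformation step is a one-line change here.
TRANSFORMS = [normalize_amount, add_has_payment_history]

def run_pipeline(records, transforms=TRANSFORMS):
    for record in records:
        for transform in transforms:
            record = transform(record)
        yield record

rows = [{"amount_dollars": 12.5, "num_prior_payments": 2}]
print(list(run_pipeline(rows)))
```

In a distributed setting, the same transform list can be mapped over partitions of the data (e.g. via Spark or warehouse-native tasks), which is what lets the pipeline absorb growing data volumes.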
At Cedar, by adopting Snowflake as our unified data infrastructure for ML, we effectively address these scalability needs through a single, integrated solution for data storage, processing and analytics. Its cloud-native architecture ensures seamless scaling and efficiently handles growing datasets and complex models. Consolidating data management tasks within Snowflake streamlines data pipelines, minimizes operational overhead, and allows us to rapidly extract valuable insights and drive innovation with data-driven machine learning applications.
4) Improved model performance
ML is a highly iterative process: developing and refining a model involves multiple cycles of design, testing and evaluation. A feedback loop is a key component of this iterative improvement, as it analyzes the model's performance and incorporates that information back into the training process. For example, when a model deployed to production underperforms its benchmark, easy access to diagnostic data and rapid feedback loops facilitate effective performance assessment. This enables proactive responses with appropriate measures, such as re-training the model on the latest data in response to data drift.
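As a concrete example of detecting data drift, one widely used statistic is the population stability index (PSI), which compares a feature's distribution at training time with its distribution in production. The sketch below is illustrative and not Cedar's actual monitoring logic.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training-time ("expected") and a
    production ("actual") sample of one feature. Larger values mean more
    drift; a common rule of thumb flags PSI > 0.2 for review."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a degenerate range

    def frac(sample, i):
        left = lo + i * width
        right = lo + (i + 1) * width
        n = sum(left <= x < right or (i == bins - 1 and x == hi) for x in sample)
        return max(n / len(sample), 1e-6)  # avoid log(0) for empty bins

    return sum(
        (frac(actual, i) - frac(expected, i))
        * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins)
    )

training = [0.1 * i for i in range(100)]
shifted = [0.1 * i + 5 for i in range(100)]
print(psi(training, training))  # identical distributions: PSI is 0
print(psi(training, shifted))   # shifted distribution: large PSI, flag for review
```

A monitoring job can compute this per feature on a schedule and use a breach of the threshold as the trigger for the re-training response described above.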
By equipping our infrastructure with automation that captures up-to-date inference data and integrates it into the model training and deployment pipeline, we can iterate more rapidly on model diagnosis, development and refinement, improving the performance and robustness of production models. Key steps in supporting feedback loops include standardizing ML development and deployment processes, and storing the necessary metadata from the data and model lifecycle after the ML experimentation phase. This metadata is later integrated into the ETL pipelines for ML inference, making it readily available in the data warehouse for both pre-defined and ad hoc analyses. It is also used to compute pre-defined analyses in our internal ML monitoring solution, MLeX. Through both the pre-defined analyses in MLeX and ad hoc analyses, we proactively make informed decisions based on data-driven insights, ultimately maintaining the highest standards for our models in production.
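For instance, the standardized metadata captured alongside each inference might look like the following sketch. The field names are illustrative, not MLeX's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class InferenceRecord:
    # Illustrative fields; a real record would also carry the identifiers
    # needed to join predictions back to outcomes in the warehouse.
    model_name: str
    model_version: str
    features: dict
    prediction: float
    scored_at: str

def log_inference(model_name, model_version, features, prediction):
    record = InferenceRecord(
        model_name=model_name,
        model_version=model_version,
        features=features,
        prediction=prediction,
        scored_at=datetime.now(timezone.utc).isoformat(),
    )
    # In production this row would be appended to a stream or staging table
    # consumed by the inference ETL pipeline; here we just return it.
    return asdict(record)

row = log_inference("payment_plan_length", "2024-01-a", {"balance": 120.0}, 0.83)
print(row["model_name"], row["prediction"])
```

Because every model emits the same shape of record, the same ETL, monitoring queries and drift checks can be reused across use-cases without per-model plumbing.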
5) Cost-effectiveness

Cedar uses ML models across multiple patient-facing and internal applications. By having a specialized infrastructure for ML, the company can avoid the costs associated with setting up and maintaining separate infrastructure for each model, while reusing resources and solutions from other use-cases. This can help to reduce costs and improve overall efficiency.
ML infrastructure in action through use-cases and ML best practices
Cedar significantly benefits from employing ML models in production, as they enhance end-to-end patient experiences and simplify the process of managing and paying for healthcare expenses. A prime example of such a benefit is our ML-powered discount model, which intelligently tailors discounts to individual patients based on their unique circumstances, leading to improved patient satisfaction and engagement (to learn more about the impact of our ML-powered discount model from Director of Data Science Sumayah Rahman, click here!).
Our robust ML infrastructure has been crucial in developing the discount model, efficiently managing and processing immense volumes of data. This data, encompassing patients' visits, payment histories, etc., holds essential patterns for the model’s learning process. Our feature engineering pipelines employ distributed data processing to efficiently extract, process, and engineer relevant features, which are then utilized in the experimentation process.
The ML experimentation process involves, among other tasks, feature selection using statistical and ML techniques, model experimentation with hyperparameter tuning, and performance evaluation across all developed models. The feature selection process refines hundreds of engineered features into various sets of statistically or predictively impactful features. For each refined feature set, the model experimentation process involves building multiple models, selecting the best-performing model, fine-tuning its hyperparameters and assessing both technical and anticipated business performance. Finally, the performance of all model variants is evaluated across all refined feature sets, and the best-performing model is closely examined before undergoing the deployment cycle. Each step of this process demands substantial computational resources.
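As a toy illustration of one step in this process, ranking engineered features by predictive signal and keeping a refined subset, consider the sketch below. Real feature selection at this scale combines several statistical and model-based techniques; this correlation-based ranking, with made-up feature names, is illustrative only.

```python
import math
import statistics

def pearson(xs, ys):
    # Pearson correlation between two equal-length numeric sequences.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def select_features(features, target, k=2):
    """features: {name: list of values}. Keeps the k features most
    correlated (in absolute value) with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]

features = {
    "balance":        [10.0, 20.0, 30.0, 40.0],
    "noise":          [3.0, 1.0, 4.0, 1.0],
    "prior_payments": [0.0, 1.0, 1.0, 2.0],
}
target = [0.0, 1.0, 1.0, 2.0]
print(select_features(features, target, k=2))
```

Each refined feature set produced this way then feeds the model experimentation and hyperparameter tuning loop described above, which is where most of the computational cost lands.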
As such, a specialized ML infrastructure has become more crucial than ever, providing the computational resources and the flexibility to efficiently manage vast amounts of data, perform extensive feature experimentation, and refine models for optimal performance, all on a use-case-by-use-case basis. By leveraging and improving on our specialized infrastructure for ML, we continue to unlock the full potential of machine learning, driving innovation and delivering superior results. We enhance the performance, scalability, reliability, cost-effectiveness and security of our ML models. Our ML infrastructure supports product differentiation in the market by enabling us to rapidly and efficiently build and deploy high-performance, data-driven models. These models drive new insights, services and capabilities, allowing us to provide unparalleled value to our patients.
About the Author: To learn more about Ethan Cha, check out his LinkedIn. You can also catch him presenting on ML monitoring at the Snowflake Summit at the end of June. Check back on the blog this Fall for a Snowflake Summit experience recap!