
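ba2.4.3.3.1. title: A baseline can mean the minimum performance a model must achieve, but in the context of data drift it also means the reference line for how far the statistical properties of the data may acceptably differ from those seen at training time. This baseline is built from the dataset used for training (‣).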


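In the modeling process (from1) and in model performance monitoring (from3), a baseline can mean the minimum performance a model must achieve (ref 7). In the context of data drift, however, it also means the reference line for how far the statistical properties of the data may acceptably differ from the conditions under which the model was trained (from2) (refs 2, 8).

In the latter case, the baseline that represents the training-time conditions is built from the dataset used for training, which is called the baseline dataset (‣ Baseline dataset) (ref 2).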
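To make the idea concrete, here is a minimal sketch of a data-drift baseline in plain Python. It is not how SageMaker Model Monitor computes its baseline. The function names `build_baseline` and `drifted_features`, the sample features, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
import statistics
from typing import Dict, List

Row = Dict[str, float]


def build_baseline(training_rows: List[Row]) -> Dict[str, Dict[str, float]]:
    """Summarize each numeric feature of the training (baseline) dataset."""
    baseline = {}
    for name in training_rows[0]:
        values = [row[name] for row in training_rows]
        baseline[name] = {
            "mean": statistics.fmean(values),
            "std_dev": statistics.pstdev(values),
        }
    return baseline


def drifted_features(
    baseline: Dict[str, Dict[str, float]],
    new_rows: List[Row],
    max_shift_in_std: float = 3.0,  # assumed tolerance, not a recommended value
) -> List[str]:
    """Return features whose mean moved further than the tolerated distance."""
    drifted = []
    for name, stats in baseline.items():
        new_mean = statistics.fmean([row[name] for row in new_rows])
        shift = abs(new_mean - stats["mean"])
        if shift > max_shift_in_std * stats["std_dev"]:
            drifted.append(name)
    return drifted


# Hypothetical training data and a hypothetical batch of newly arriving data.
training = [
    {"age": 30.0, "minutes": 100.0},
    {"age": 40.0, "minutes": 120.0},
    {"age": 50.0, "minutes": 110.0},
]
incoming = [
    {"age": 41.0, "minutes": 300.0},
    {"age": 39.0, "minutes": 320.0},
]

baseline = build_baseline(training)
print(drifted_features(baseline, incoming))
```

Here the baseline is the per-feature summary, and the tolerance expresses which differences are still acceptable. Only `minutes` is flagged, because its mean moved far outside the range seen during training while the mean of `age` did not move.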

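Figure (ref 4): The figure below is a diagram of the Amazon SageMaker model drift monitoring system. Training Data appears on the left, and it is fed into the Baselining task as input.

Consider another figure. The figure below is the model drift system diagram from the official Amazon SageMaker example documentation. The same data that goes into the SageMaker Training job is also fed into the Baseline Processing Job. The outputs of that job are the statistics and constraints of the baseline dataset.

Figure (ref 5)

In practice, running code like the following with the SageMaker Python SDK saves a constraints.json file and a statistics.json file to S3.

Code (ref 6)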

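from sagemaker import get_execution_role
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# The note does not show how `monitor` is created; the instance settings below are assumed.
monitor = DefaultModelMonitor(
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

s3_path = "s3://monitoring/xgb-churn-data"

# Profile the training dataset and write statistics.json and constraints.json under baseline/.
monitor.suggest_baseline(
    baseline_dataset=s3_path + "/training-dataset.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=s3_path + "/baseline/",
    wait=True,
)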

The AWS example script above automatically generates a basic baseline from the statistics and types of the data features, but it cannot generate a baseline for model performance.
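As a rough illustration of what a baseline built from feature types and statistics can look like, the sketch below infers a type and a completeness ratio per column and then checks a new batch against them. It is a simplified stand-in, not Model Monitor's implementation. The type labels, dictionary keys, function names, and sample rows are hypothetical, and the dictionary is not the actual constraints.json schema.

```python
from typing import Dict, List

Row = Dict[str, str]

# Illustrative type labels mapped to the Python parser used to test them.
CASTS = {"Integral": int, "Fractional": float, "String": str}


def matches(value: str, type_name: str) -> bool:
    """Return True if the raw text value parses as the given type."""
    try:
        CASTS[type_name](value)
        return True
    except ValueError:
        return False


def infer_type(values: List[str]) -> str:
    """Pick the narrowest type that every value parses as."""
    for type_name in ("Integral", "Fractional"):
        if all(matches(v, type_name) for v in values):
            return type_name
    return "String"


def present_values(rows: List[Row], name: str) -> List[str]:
    """Collect the non-missing values of one column."""
    return [row[name] for row in rows if row.get(name) not in (None, "")]


def suggest_constraints(rows: List[Row]) -> Dict[str, dict]:
    """Derive per-column constraints from the baseline dataset."""
    constraints = {}
    for name in rows[0]:
        present = present_values(rows, name)
        constraints[name] = {
            "inferred_type": infer_type(present),
            "completeness": len(present) / len(rows),
        }
    return constraints


def violations(constraints: Dict[str, dict], rows: List[Row]) -> List[str]:
    """List the columns of a new batch that break the baseline constraints."""
    found = []
    for name, rule in constraints.items():
        present = present_values(rows, name)
        if len(present) / len(rows) < rule["completeness"]:
            found.append(f"{name}: completeness below baseline")
        if not all(matches(v, rule["inferred_type"]) for v in present):
            found.append(f"{name}: value does not parse as {rule['inferred_type']}")
    return found


# Hypothetical baseline rows and a hypothetical incoming batch.
training = [
    {"calls": "3", "charge": "12.5", "plan": "basic"},
    {"calls": "5", "charge": "8.0", "plan": "premium"},
]
incoming = [
    {"calls": "4", "charge": "n/a", "plan": "basic"},
    {"calls": "", "charge": "9.1", "plan": "basic"},
]

constraints = suggest_constraints(training)
for problem in violations(constraints, incoming):
    print(problem)
```

Nothing in this sketch looks at predictions or labels, which is why a baseline of this kind can describe the input data but says nothing about model performance.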

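To monitor model performance as well, the monitor must be given performance information about the current model, and that requires the ModelQualityMonitor class. Calling the suggest_baseline() method on an instance of this class lets you pass information about the model's predictions and the ground-truth labels as arguments, so a performance baseline for the current model can be built.

Code (ref 7)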

probability,prediction,label
0.01516005303710699,0,0
0.1684480607509613,0,0
0.21427156031131744,0,0
0.06330718100070953,0,0
0.02791607193648815,0,0
0.014169521629810333,0,0
0.00571369007229805,0,0
0.10534518957138062,0,0
0.025899196043610573,0,0
test_data/validation_with_predictions.csv

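from sagemaker.model_monitor import ModelQualityMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Constructor arguments are elided in the source note.
model_quality_monitor = ModelQualityMonitor(
    ...
)

# baseline_job_name, baseline_dataset_uri, and baseline_results_uri are not defined in this note.
job = model_quality_monitor.suggest_baseline(
    job_name=baseline_job_name,
    baseline_dataset=baseline_dataset_uri,  # test_data/validation_with_predictions.csv
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    problem_type="BinaryClassification",
    inference_attribute="prediction",  # column holding the predicted class
    probability_attribute="probability",  # column holding the predicted probability
    ground_truth_attribute="label",  # column holding the true class
)
job.wait(logs=False)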
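To show what kind of numbers a performance baseline holds, the following sketch computes a few binary classification metrics directly from the sample rows of validation_with_predictions.csv shown above. It is a local illustration only and does not reproduce the output of the SageMaker baselining job. The function names `safe_ratio` and `quality_baseline` and the choice of metrics are assumptions.

```python
import csv
import io
from typing import Dict, Optional

# The nine sample rows quoted in this note.
SAMPLE_CSV = """probability,prediction,label
0.01516005303710699,0,0
0.1684480607509613,0,0
0.21427156031131744,0,0
0.06330718100070953,0,0
0.02791607193648815,0,0
0.014169521629810333,0,0
0.00571369007229805,0,0
0.10534518957138062,0,0
0.025899196043610573,0,0
"""


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return the ratio, or None when the metric is undefined."""
    return numerator / denominator if denominator else None


def quality_baseline(csv_text: str) -> Dict[str, Optional[float]]:
    """Compute accuracy, precision, and recall from prediction/label columns."""
    tp = fp = tn = fn = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        predicted, actual = int(row["prediction"]), int(row["label"])
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": safe_ratio(tp + tn, total),
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
    }


metrics = quality_baseline(SAMPLE_CSV)
print(metrics)
```

Every prediction and label in these nine rows is 0, so accuracy comes out as 1.0 while precision and recall are undefined (`None`). A usable performance baseline therefore needs a validation set that contains both classes.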

parse me : Material that might be worth using in this note someday.

None

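from : Which earlier ideas produced this one?

ba2.3.9.1. title: Start simple. Begin with something simple and finish it. Build a working system using the smallest model, the simplest techniques, the least data, and the least code you can.

ba2.4.3.3. title: Monitoring (‣) the difference between the data used for training and newly arriving data (‣) makes it possible to monitor model drift (‣).

ba2.4.3.4. title: A drop in model performance can indicate model drift. Choosing a metric (‣) and tracking it makes it possible to monitor model drift (‣).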

supplementary : Which new ideas support the idea written in this document?

None

opposite : Which new ideas contrast with the idea written in this document?

None

to : Which ideas does the idea written in this document develop into and lead to?

None

ref : References

1. chapter6, The first is the target dataset. This can be the same dataset you used to train a model, although special care needs to be put into ensuring that the number (and order) of features doesn’t change. The second one is the baseline. The baseline determines what differences may (or may not) be acceptable when a model gets trained.

2. The baselining task establishes a data profile from training data. It uses Amazon SageMaker Model Monitor and runs before training or re-training the model.

3. chapter6, It is common to find online documentation that refers to the baseline dataset interchangeably with target dataset, since both can initially be the same. This can make it confusing when trying to grasp some of these concepts. It is useful to think about the baseline dataset as the data used to create the gold standard (the baseline) and any newer data as the target.

4. Basic architecture on how data drift is detected using Amazon SageMaker

5. Model Monitor uses rules to detect drift in your models and alerts you when it happens. The following figure shows how this process works in the case that your model is deployed to a real-time endpoint.

6. chapter6, Now that the monitor is available, use the suggest_baseline() method to produce a default baseline for the model … There should be two files saved in the configured S3 bucket: constraints.json and statistics.json.

7. you generate a baseline model quality that you can use to continuously monitor model quality against. To generate the model quality baseline, you first invoke the endpoint created earlier using validation data. Predictions from the deployed model using this validation data are used as a baseline dataset. You can use either the training or validation dataset to create the baseline.

8. you configure a processing job to generate statistical rules and constraints (referred to as your baseline) against which the model quality drift can be detected. Model Monitor suggests a set of default baseline statistics and constraints.