# Auto Classes[[auto-classes]]

In many cases, the architecture you want to use can be inferred from the name or path of the pretrained model you pass to the `from_pretrained()` method. AutoClasses exist to do this job for you: given the name/path of the pretrained weights/config/vocabulary, they automatically retrieve the relevant model.

Instantiating one of [AutoConfig](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoConfig), [AutoModel](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel), or [AutoTokenizer](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoTokenizer) will directly create a class of the relevant architecture. For instance,

```python
model = AutoModel.from_pretrained("google-bert/bert-base-cased")
```

will create a model that is an instance of [BertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertModel).

There is one `AutoModel` class for each task, and for each backend (PyTorch, TensorFlow, or Flax).

## Extending the Auto Classes[[extending-the-auto-classes]]

Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a custom model class `NewModel`, make sure you have a `NewModelConfig`, then you can add them to the auto classes like this:

```python
from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
```

You will then be able to use the auto classes as you usually would!

If your `NewModelConfig` is a subclass of [PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), make sure its `model_type` attribute is set to the same key you use when registering the config (here `"new-model"`).

Likewise, if your `NewModel` is a subclass of [PreTrainedModel](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel), make sure its `config_class` attribute is set to the same class you use when registering the model (here `NewModelConfig`).
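The two consistency rules above can be illustrated with a minimal, framework-free sketch. `MiniAutoRegistry`, `NewModelConfig`, and `NewModel` below are hypothetical stand-ins for the real transformers registry and classes, written only to show why the `model_type` key and `config_class` attribute must line up:

```python
# Hypothetical sketch of the auto-class registry pattern; not the real
# transformers internals.

class MiniAutoRegistry:
    def __init__(self):
        self._config_for_type = {}   # model_type key -> config class
        self._model_for_config = {}  # config class -> model class

    def register_config(self, model_type, config_cls):
        # mirrors AutoConfig.register: the key must match the
        # config class's model_type attribute
        if getattr(config_cls, "model_type", None) != model_type:
            raise ValueError(f"model_type mismatch for key {model_type!r}")
        self._config_for_type[model_type] = config_cls

    def register_model(self, config_cls, model_cls):
        # mirrors AutoModel.register: the model's config_class
        # attribute must be the class it is registered under
        if getattr(model_cls, "config_class", None) is not config_cls:
            raise ValueError("config_class mismatch")
        self._model_for_config[config_cls] = model_cls


class NewModelConfig:
    model_type = "new-model"   # must equal the registration key


class NewModel:
    config_class = NewModelConfig  # must equal the registered config class


registry = MiniAutoRegistry()
registry.register_config("new-model", NewModelConfig)
registry.register_model(NewModelConfig, NewModel)
```

If either attribute is missing or mismatched, registration fails up front rather than producing a model that cannot be resolved later.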

## AutoConfig[[transformers.AutoConfig]]

#### transformers.AutoConfig[[transformers.AutoConfig]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1207)

This is a generic configuration class that will be instantiated as one of the configuration classes of the library
when created with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoConfig.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoConfig.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1230)

`from_pretrained(pretrained_model_name_or_path: Union[str, os.PathLike[str]], **kwargs)`

Parameters:

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a pretrained model configuration hosted inside a model repo on
    huggingface.co.
  - A path to a *directory* containing a configuration file saved using the
    [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) method,
    e.g., `./my_model_directory/`.
  - A path or url to a saved configuration JSON *file*, e.g.,
    `./my_model_directory/configuration.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force the (re-)download of the model weights and configuration files, overriding the
  cached versions if they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final configuration object.

  If `True`, then this function returns a tuple `(config, unused_kwargs)` where *unused_kwargs* is a
  dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the
  part of `kwargs` which has not been used to update `config` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  The values in kwargs of any keys which are configuration attributes will be used to override the loaded
  values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled
  by the `return_unused_kwargs` keyword parameter.

Instantiate one of the configuration classes of the library from a pretrained model configuration.

The configuration class to instantiate is selected based on the `model_type` property of the config object that
is loaded, or when it's missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- `Aimv2Config` (AIMv2 model)
- **aimv2_vision_model** -- `Aimv2VisionConfig` (Aimv2VisionModel model)
- **albert** -- [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) (ALBERT model)
- **align** -- `AlignConfig` (ALIGN model)
- **altclip** -- [AltCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPConfig) (AltCLIP model)
- **apertus** -- `ApertusConfig` (Apertus model)
- **arcee** -- `ArceeConfig` (Arcee model)
- **aria** -- `AriaConfig` (Aria model)
- **aria_text** -- `AriaTextConfig` (AriaText model)
- **audio-spectrogram-transformer** -- `ASTConfig` (Audio Spectrogram Transformer model)
- **autoformer** -- [AutoformerConfig](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerConfig) (Autoformer model)
- **aya_vision** -- `AyaVisionConfig` (AyaVision model)
- **bamba** -- `BambaConfig` (Bamba model)
- **bark** -- `BarkConfig` (Bark model)
- **bart** -- [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) (BART model)
- **beit** -- `BeitConfig` (BEiT model)
- **bert** -- [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) (BERT model)
- **bert-generation** -- `BertGenerationConfig` (Bert Generation model)
- **big_bird** -- `BigBirdConfig` (BigBird model)
- **bigbird_pegasus** -- `BigBirdPegasusConfig` (BigBird-Pegasus model)
- **biogpt** -- [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) (BioGpt model)
- **bit** -- `BitConfig` (BiT model)
- **bitnet** -- `BitNetConfig` (BitNet model)
- **blenderbot** -- `BlenderbotConfig` (Blenderbot model)
- **blenderbot-small** -- `BlenderbotSmallConfig` (BlenderbotSmall model)
- **blip** -- [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) (BLIP model)
- **blip-2** -- [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) (BLIP-2 model)
- **blip_2_qformer** -- [Blip2QFormerConfig](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerConfig) (BLIP-2 QFormer model)
- **bloom** -- `BloomConfig` (BLOOM model)
- **blt** -- `BltConfig` (Blt model)
- **bridgetower** -- `BridgeTowerConfig` (BridgeTower model)
- **bros** -- `BrosConfig` (BROS model)
- **camembert** -- `CamembertConfig` (CamemBERT model)
- **canine** -- `CanineConfig` (CANINE model)
- **chameleon** -- [ChameleonConfig](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonConfig) (Chameleon model)
- **chinese_clip** -- `ChineseCLIPConfig` (Chinese-CLIP model)
- **chinese_clip_vision_model** -- `ChineseCLIPVisionConfig` (ChineseCLIPVisionModel model)
- **clap** -- `ClapConfig` (CLAP model)
- **clip** -- [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) (CLIP model)
- **clip_text_model** -- [CLIPTextConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextConfig) (CLIPTextModel model)
- **clip_vision_model** -- [CLIPVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionConfig) (CLIPVisionModel model)
- **clipseg** -- [CLIPSegConfig](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegConfig) (CLIPSeg model)
- **clvp** -- `ClvpConfig` (CLVP model)
- **code_llama** -- [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) (CodeLlama model)
- **codegen** -- [CodeGenConfig](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenConfig) (CodeGen model)
- **cohere** -- [CohereConfig](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereConfig) (Cohere model)
- **cohere2** -- `Cohere2Config` (Cohere2 model)
- **cohere2_vision** -- `Cohere2VisionConfig` (Cohere2Vision model)
- **colpali** -- `ColPaliConfig` (ColPali model)
- **colqwen2** -- `ColQwen2Config` (ColQwen2 model)
- **conditional_detr** -- `ConditionalDetrConfig` (Conditional DETR model)
- **convbert** -- [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) (ConvBERT model)
- **convnext** -- `ConvNextConfig` (ConvNeXT model)
- **convnextv2** -- `ConvNextV2Config` (ConvNeXTV2 model)
- **cpmant** -- `CpmAntConfig` (CPM-Ant model)
- **csm** -- `CsmConfig` (CSM model)
- **ctrl** -- `CTRLConfig` (CTRL model)
- **cvt** -- `CvtConfig` (CvT model)
- **d_fine** -- `DFineConfig` (D-FINE model)
- **dab-detr** -- `DabDetrConfig` (DAB-DETR model)
- **dac** -- `DacConfig` (DAC model)
- **data2vec-audio** -- `Data2VecAudioConfig` (Data2VecAudio model)
- **data2vec-text** -- `Data2VecTextConfig` (Data2VecText model)
- **data2vec-vision** -- `Data2VecVisionConfig` (Data2VecVision model)
- **dbrx** -- [DbrxConfig](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxConfig) (DBRX model)
- **deberta** -- [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) (DeBERTa-v2 model)
- **decision_transformer** -- `DecisionTransformerConfig` (Decision Transformer model)
- **deepseek_v2** -- `DeepseekV2Config` (DeepSeek-V2 model)
- **deepseek_v3** -- [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) (DeepSeek-V3 model)
- **deepseek_vl** -- `DeepseekVLConfig` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridConfig` (DeepseekVLHybrid model)
- **deformable_detr** -- `DeformableDetrConfig` (Deformable DETR model)
- **deit** -- `DeiTConfig` (DeiT model)
- **depth_anything** -- `DepthAnythingConfig` (Depth Anything model)
- **depth_pro** -- `DepthProConfig` (DepthPro model)
- **deta** -- `DetaConfig` (DETA model)
- **detr** -- `DetrConfig` (DETR model)
- **dia** -- `DiaConfig` (Dia model)
- **diffllama** -- `DiffLlamaConfig` (DiffLlama model)
- **dinat** -- `DinatConfig` (DiNAT model)
- **dinov2** -- `Dinov2Config` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersConfig` (DINOv2 with Registers model)
- **dinov3_convnext** -- `DINOv3ConvNextConfig` (DINOv3 ConvNext model)
- **dinov3_vit** -- `DINOv3ViTConfig` (DINOv3 ViT model)
- **distilbert** -- `DistilBertConfig` (DistilBERT model)
- **doge** -- `DogeConfig` (Doge model)
- **donut-swin** -- `DonutSwinConfig` (DonutSwin model)
- **dots1** -- `Dots1Config` (dots1 model)
- **dpr** -- `DPRConfig` (DPR model)
- **dpt** -- `DPTConfig` (DPT model)
- **edgetam** -- `EdgeTamConfig` (EdgeTAM model)
- **edgetam_video** -- `EdgeTamVideoConfig` (EdgeTamVideo model)
- **edgetam_vision_model** -- `EdgeTamVisionConfig` (EdgeTamVisionModel model)
- **efficientformer** -- `EfficientFormerConfig` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRConfig` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetConfig` (EfficientNet model)
- **electra** -- [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) (ELECTRA model)
- **emu3** -- `Emu3Config` (Emu3 model)
- **encodec** -- `EncodecConfig` (EnCodec model)
- **encoder-decoder** -- [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) (Encoder decoder model)
- **eomt** -- `EomtConfig` (EoMT model)
- **ernie** -- `ErnieConfig` (ERNIE model)
- **ernie4_5** -- `Ernie4_5Config` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeConfig` (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMConfig` (ErnieM model)
- **esm** -- [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) (ESM model)
- **evolla** -- `EvollaConfig` (Evolla model)
- **exaone4** -- [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) (EXAONE-4.0 model)
- **falcon** -- `FalconConfig` (Falcon model)
- **falcon_h1** -- `FalconH1Config` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaConfig` (FalconMamba model)
- **fastspeech2_conformer** -- `FastSpeech2ConformerConfig` (FastSpeech2Conformer model)
- **fastspeech2_conformer_with_hifigan** -- `FastSpeech2ConformerWithHifiGanConfig` (FastSpeech2ConformerWithHifiGan model)
- **flaubert** -- `FlaubertConfig` (FlauBERT model)
- **flava** -- `FlavaConfig` (FLAVA model)
- **flex_olmo** -- `FlexOlmoConfig` (FlexOlmo model)
- **florence2** -- `Florence2Config` (Florence2 model)
- **fnet** -- `FNetConfig` (FNet model)
- **focalnet** -- `FocalNetConfig` (FocalNet model)
- **fsmt** -- `FSMTConfig` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelConfig` (Funnel Transformer model)
- **fuyu** -- `FuyuConfig` (Fuyu model)
- **gemma** -- [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) (Gemma model)
- **gemma2** -- [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) (Gemma2 model)
- **gemma3** -- [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- [Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nConfig` (Gemma3nForConditionalGeneration model)
- **gemma3n_audio** -- `Gemma3nAudioConfig` (Gemma3nAudioEncoder model)
- **gemma3n_text** -- `Gemma3nTextConfig` (Gemma3nForCausalLM model)
- **gemma3n_vision** -- `Gemma3nVisionConfig` (TimmWrapperModel model)
- **git** -- `GitConfig` (GIT model)
- **glm** -- `GlmConfig` (GLM model)
- **glm4** -- `Glm4Config` (GLM4 model)
- **glm4_moe** -- `Glm4MoeConfig` (Glm4MoE model)
- **glm4v** -- `Glm4vConfig` (GLM4V model)
- **glm4v_moe** -- `Glm4vMoeConfig` (GLM4VMOE model)
- **glm4v_moe_text** -- `Glm4vMoeTextConfig` (GLM4VMOE model)
- **glm4v_text** -- `Glm4vTextConfig` (GLM4V model)
- **glpn** -- `GLPNConfig` (GLPN model)
- **got_ocr2** -- `GotOcr2Config` (GOT-OCR2 model)
- **gpt-sw3** -- [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) (GPT-Sw3 model)
- **gpt2** -- [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeConfig` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoConfig` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXConfig` (GPT NeoX model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseConfig](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssConfig` (GptOss model)
- **gptj** -- `GPTJConfig` (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseConfig` (GPTSAN-japanese model)
- **granite** -- `GraniteConfig` (Granite model)
- **granite_speech** -- `GraniteSpeechConfig` (GraniteSpeech model)
- **granitemoe** -- `GraniteMoeConfig` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridConfig` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedConfig` (GraniteMoeSharedMoe model)
- **granitevision** -- `LlavaNextConfig` (LLaVA-NeXT model)
- **graphormer** -- [GraphormerConfig](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerConfig) (Graphormer model)
- **grounding-dino** -- [GroundingDinoConfig](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoConfig) (Grounding DINO model)
- **groupvit** -- `GroupViTConfig` (GroupViT model)
- **helium** -- `HeliumConfig` (Helium model)
- **hgnet_v2** -- `HGNetV2Config` (HGNet-V2 model)
- **hiera** -- `HieraConfig` (Hiera model)
- **hubert** -- `HubertConfig` (Hubert model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1Config` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1Config` (HunYuanMoeV1 model)
- **ibert** -- `IBertConfig` (I-BERT model)
- **idefics** -- `IdeficsConfig` (IDEFICS model)
- **idefics2** -- `Idefics2Config` (Idefics2 model)
- **idefics3** -- `Idefics3Config` (Idefics3 model)
- **idefics3_vision** -- `Idefics3VisionConfig` (Idefics3VisionTransformer model)
- **ijepa** -- `IJepaConfig` (I-JEPA model)
- **imagegpt** -- `ImageGPTConfig` (ImageGPT model)
- **informer** -- [InformerConfig](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerConfig) (Informer model)
- **instructblip** -- `InstructBlipConfig` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoConfig` (InstructBlipVideo model)
- **internvl** -- `InternVLConfig` (InternVL model)
- **internvl_vision** -- `InternVLVisionConfig` (InternVLVision model)
- **jamba** -- [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) (Jamba model)
- **janus** -- `JanusConfig` (Janus model)
- **jetmoe** -- `JetMoeConfig` (JetMoe model)
- **jukebox** -- `JukeboxConfig` (Jukebox model)
- **kosmos-2** -- `Kosmos2Config` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Config` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextConfig` (KyutaiSpeechToText model)
- **layoutlm** -- `LayoutLMConfig` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Config` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Config` (LayoutLMv3 model)
- **led** -- `LEDConfig` (LED model)
- **levit** -- `LevitConfig` (LeViT model)
- **lfm2** -- `Lfm2Config` (Lfm2 model)
- **lfm2_vl** -- `Lfm2VlConfig` (Lfm2Vl model)
- **lightglue** -- `LightGlueConfig` (LightGlue model)
- **lilt** -- `LiltConfig` (LiLT model)
- **llama** -- [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) (LLaMA model)
- **llama4** -- `Llama4Config` (Llama4 model)
- **llama4_text** -- `Llama4TextConfig` (Llama4ForCausalLM model)
- **llava** -- `LlavaConfig` (LLaVa model)
- **llava_next** -- `LlavaNextConfig` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoConfig` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionConfig` (LLaVA-Onevision model)
- **longcat_flash** -- `LongcatFlashConfig` (LongCatFlash model)
- **longformer** -- `LongformerConfig` (Longformer model)
- **longt5** -- `LongT5Config` (LongT5 model)
- **luke** -- `LukeConfig` (LUKE model)
- **lxmert** -- `LxmertConfig` (LXMERT model)
- **m2m_100** -- `M2M100Config` (M2M100 model)
- **mamba** -- [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) (Mamba model)
- **mamba2** -- [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) (mamba2 model)
- **marian** -- [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) (Marian model)
- **markuplm** -- `MarkupLMConfig` (MarkupLM model)
- **mask2former** -- `Mask2FormerConfig` (Mask2Former model)
- **maskformer** -- `MaskFormerConfig` (MaskFormer model)
- **maskformer-swin** -- `MaskFormerSwinConfig` (MaskFormerSwin model)
- **mbart** -- `MBartConfig` (mBART model)
- **mctct** -- `MCTCTConfig` (M-CTC-T model)
- **mega** -- `MegaConfig` (MEGA model)
- **megatron-bert** -- `MegatronBertConfig` (Megatron-BERT model)
- **metaclip_2** -- `MetaClip2Config` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrConfig` (MGP-STR model)
- **mimi** -- `MimiConfig` (Mimi model)
- **minimax** -- `MiniMaxConfig` (MiniMax model)
- **ministral** -- `MinistralConfig` (Ministral model)
- **mistral** -- [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) (Mistral model)
- **mistral3** -- `Mistral3Config` (Mistral3 model)
- **mixtral** -- `MixtralConfig` (Mixtral model)
- **mlcd** -- `MLCDVisionConfig` (MLCD model)
- **mllama** -- `MllamaConfig` (Mllama model)
- **mm-grounding-dino** -- `MMGroundingDinoConfig` (MM Grounding DINO model)
- **mobilebert** -- `MobileBertConfig` (MobileBERT model)
- **mobilenet_v1** -- `MobileNetV1Config` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2Config` (MobileNetV2 model)
- **mobilevit** -- `MobileViTConfig` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2Config` (MobileViTV2 model)
- **modernbert** -- `ModernBertConfig` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderConfig` (ModernBertDecoder model)
- **moonshine** -- `MoonshineConfig` (Moonshine model)
- **moshi** -- `MoshiConfig` (Moshi model)
- **mpnet** -- `MPNetConfig` (MPNet model)
- **mpt** -- `MptConfig` (MPT model)
- **mra** -- `MraConfig` (MRA model)
- **mt5** -- `MT5Config` (MT5 model)
- **musicgen** -- `MusicgenConfig` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyConfig` (MusicGen Melody model)
- **mvp** -- `MvpConfig` (MVP model)
- **nat** -- `NatConfig` (NAT model)
- **nemotron** -- `NemotronConfig` (Nemotron model)
- **nezha** -- `NezhaConfig` (Nezha model)
- **nllb-moe** -- `NllbMoeConfig` (NLLB-MOE model)
- **nougat** -- `VisionEncoderDecoderConfig` (Nougat model)
- **nystromformer** -- `NystromformerConfig` (Nyströmformer model)
- **olmo** -- `OlmoConfig` (OLMo model)
- **olmo2** -- `Olmo2Config` (OLMo2 model)
- **olmo3** -- `Olmo3Config` (Olmo3 model)
- **olmoe** -- `OlmoeConfig` (OLMoE model)
- **omdet-turbo** -- `OmDetTurboConfig` (OmDet-Turbo model)
- **oneformer** -- `OneFormerConfig` (OneFormer model)
- **open-llama** -- `OpenLlamaConfig` (OpenLlama model)
- **openai-gpt** -- [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) (OpenAI GPT model)
- **opt** -- `OPTConfig` (OPT model)
- **ovis2** -- `Ovis2Config` (Ovis2 model)
- **owlv2** -- `Owlv2Config` (OWLv2 model)
- **owlvit** -- `OwlViTConfig` (OWL-ViT model)
- **paligemma** -- [PaliGemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaConfig) (PaliGemma model)
- **parakeet_ctc** -- `ParakeetCTCConfig` (Parakeet model)
- **parakeet_encoder** -- `ParakeetEncoderConfig` (ParakeetEncoder model)
- **patchtsmixer** -- [PatchTSMixerConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) (PatchTSMixer model)
- **patchtst** -- [PatchTSTConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTConfig) (PatchTST model)
- **pegasus** -- `PegasusConfig` (Pegasus model)
- **pegasus_x** -- `PegasusXConfig` (PEGASUS-X model)
- **perceiver** -- `PerceiverConfig` (Perceiver model)
- **perception_encoder** -- `TimmWrapperConfig` (PerceptionEncoder model)
- **perception_lm** -- `PerceptionLMConfig` (PerceptionLM model)
- **persimmon** -- `PersimmonConfig` (Persimmon model)
- **phi** -- `PhiConfig` (Phi model)
- **phi3** -- `Phi3Config` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalConfig` (Phi4Multimodal model)
- **phimoe** -- `PhimoeConfig` (Phimoe model)
- **pix2struct** -- `Pix2StructConfig` (Pix2Struct model)
- **pixtral** -- `PixtralVisionConfig` (Pixtral model)
- **plbart** -- `PLBartConfig` (PLBart model)
- **poolformer** -- `PoolFormerConfig` (PoolFormer model)
- **pop2piano** -- `Pop2PianoConfig` (Pop2Piano model)
- **prompt_depth_anything** -- `PromptDepthAnythingConfig` (PromptDepthAnything model)
- **prophetnet** -- `ProphetNetConfig` (ProphetNet model)
- **pvt** -- `PvtConfig` (PVT model)
- **pvt_v2** -- `PvtV2Config` (PVTv2 model)
- **qdqbert** -- `QDQBertConfig` (QDQBert model)
- **qwen2** -- `Qwen2Config` (Qwen2 model)
- **qwen2_5_omni** -- `Qwen2_5OmniConfig` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2_5_VLConfig` (Qwen2_5_VL model)
- **qwen2_5_vl_text** -- `Qwen2_5_VLTextConfig` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2AudioConfig` (Qwen2Audio model)
- **qwen2_audio_encoder** -- `Qwen2AudioEncoderConfig` (Qwen2AudioEncoder model)
- **qwen2_moe** -- `Qwen2MoeConfig` (Qwen2MoE model)
- **qwen2_vl** -- [Qwen2VLConfig](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLConfig) (Qwen2VL model)
- **qwen2_vl_text** -- `Qwen2VLTextConfig` (Qwen2VL model)
- **qwen3** -- `Qwen3Config` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeConfig` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextConfig` (Qwen3Next model)
- **qwen3_omni_moe** -- `Qwen3OmniMoeConfig` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen3VLConfig` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLMoeConfig` (Qwen3VLMoe model)
- **qwen3_vl_moe_text** -- `Qwen3VLMoeTextConfig` (Qwen3VLMoe model)
- **qwen3_vl_text** -- `Qwen3VLTextConfig` (Qwen3VL model)
- **rag** -- [RagConfig](/docs/transformers/v4.57.1/ko/model_doc/rag#transformers.RagConfig) (RAG model)
- **realm** -- `RealmConfig` (REALM model)
- **recurrent_gemma** -- `RecurrentGemmaConfig` (RecurrentGemma model)
- **reformer** -- `ReformerConfig` (Reformer model)
- **regnet** -- `RegNetConfig` (RegNet model)
- **rembert** -- `RemBertConfig` (RemBERT model)
- **resnet** -- `ResNetConfig` (ResNet model)
- **retribert** -- `RetriBertConfig` (RetriBERT model)
- **roberta** -- [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormConfig` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertConfig` (RoCBert model)
- **roformer** -- `RoFormerConfig` (RoFormer model)
- **rt_detr** -- `RTDetrConfig` (RT-DETR model)
- **rt_detr_resnet** -- `RTDetrResNetConfig` (RT-DETR-ResNet model)
- **rt_detr_v2** -- `RTDetrV2Config` (RT-DETRv2 model)
- **rwkv** -- `RwkvConfig` (RWKV model)
- **sam** -- `SamConfig` (SAM model)
- **sam2** -- `Sam2Config` (SAM2 model)
- **sam2_hiera_det_model** -- `Sam2HieraDetConfig` (Sam2HieraDetModel model)
- **sam2_video** -- `Sam2VideoConfig` (Sam2VideoModel model)
- **sam2_vision_model** -- `Sam2VisionConfig` (Sam2VisionModel model)
- **sam_hq** -- `SamHQConfig` (SAM-HQ model)
- **sam_hq_vision_model** -- `SamHQVisionConfig` (SamHQVisionModel model)
- **sam_vision_model** -- `SamVisionConfig` (SamVisionModel model)
- **seamless_m4t** -- `SeamlessM4TConfig` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2Config` (SeamlessM4Tv2 model)
- **seed_oss** -- `SeedOssConfig` (SeedOss model)
- **segformer** -- `SegformerConfig` (SegFormer model)
- **seggpt** -- `SegGptConfig` (SegGPT model)
- **sew** -- `SEWConfig` (SEW model)
- **sew-d** -- `SEWDConfig` (SEW-D model)
- **shieldgemma2** -- `ShieldGemma2Config` (Shieldgemma2 model)
- **siglip** -- [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) (SigLIP model)
- **siglip2** -- `Siglip2Config` (SigLIP2 model)
- **siglip2_vision_model** -- `Siglip2VisionConfig` (Siglip2VisionModel model)
- **siglip_vision_model** -- [SiglipVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionConfig) (SiglipVisionModel model)
- **smollm3** -- `SmolLM3Config` (SmolLM3 model)
- **smolvlm** -- [SmolVLMConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMConfig) (SmolVLM model)
- **smolvlm_vision** -- [SmolVLMVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionConfig) (SmolVLMVisionTransformer model)
- **speech-encoder-decoder** -- `SpeechEncoderDecoderConfig` (Speech Encoder decoder model)
- **speech_to_text** -- `Speech2TextConfig` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Config` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Config` (SpeechT5 model)
- **splinter** -- `SplinterConfig` (Splinter model)
- **squeezebert** -- `SqueezeBertConfig` (SqueezeBERT model)
- **stablelm** -- `StableLmConfig` (StableLm model)
- **starcoder2** -- `Starcoder2Config` (Starcoder2 model)
- **superglue** -- `SuperGlueConfig` (SuperGlue model)
- **superpoint** -- `SuperPointConfig` (SuperPoint model)
- **swiftformer** -- `SwiftFormerConfig` (SwiftFormer model)
- **swin** -- [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) (Swin Transformer model)
- **swin2sr** -- [Swin2SRConfig](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRConfig) (Swin2SR model)
- **swinv2** -- [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) (Swin Transformer V2 model)
- **switch_transformers** -- `SwitchTransformersConfig` (SwitchTransformers model)
- **t5** -- `T5Config` (T5 model)
- **t5gemma** -- `T5GemmaConfig` (T5Gemma model)
- **table-transformer** -- `TableTransformerConfig` (Table Transformer model)
- **tapas** -- `TapasConfig` (TAPAS model)
- **textnet** -- `TextNetConfig` (TextNet model)
- **time_series_transformer** -- [TimeSeriesTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) (Time Series Transformer model)
- **timesfm** -- `TimesFmConfig` (TimesFm model)
- **timesformer** -- [TimesformerConfig](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerConfig) (TimeSformer model)
- **timm_backbone** -- `TimmBackboneConfig` (TimmBackbone model)
- **timm_wrapper** -- `TimmWrapperConfig` (TimmWrapperModel model)
- **trajectory_transformer** -- [TrajectoryTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerConfig) (Trajectory Transformer model)
- **transfo-xl** -- `TransfoXLConfig` (Transformer-XL model)
- **trocr** -- `TrOCRConfig` (TrOCR model)
- **tvlt** -- `TvltConfig` (TVLT model)
- **tvp** -- [TvpConfig](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpConfig) (TVP model)
- **udop** -- `UdopConfig` (UDOP model)
- **umt5** -- `UMT5Config` (UMT5 model)
- **unispeech** -- `UniSpeechConfig` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatConfig` (UniSpeechSat model)
- **univnet** -- `UnivNetConfig` (UnivNet model)
- **upernet** -- `UperNetConfig` (UPerNet model)
- **van** -- `VanConfig` (VAN model)
- **vaultgemma** -- `VaultGemmaConfig` (VaultGemma model)
- **video_llava** -- `VideoLlavaConfig` (VideoLlava model)
- **videomae** -- `VideoMAEConfig` (VideoMAE model)
- **vilt** -- `ViltConfig` (ViLT model)
- **vipllava** -- `VipLlavaConfig` (VipLlava model)
- **vision-encoder-decoder** -- `VisionEncoderDecoderConfig` (Vision Encoder decoder model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderConfig` (VisionTextDualEncoder model)
- **visual_bert** -- `VisualBertConfig` (VisualBERT model)
- **vit** -- [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) (ViT model)
- **vit_hybrid** -- `ViTHybridConfig` (ViT Hybrid model)
- **vit_mae** -- `ViTMAEConfig` (ViTMAE model)
- **vit_msn** -- `ViTMSNConfig` (ViTMSN model)
- **vitdet** -- `VitDetConfig` (VitDet model)
- **vitmatte** -- `VitMatteConfig` (ViTMatte model)
- **vitpose** -- `VitPoseConfig` (ViTPose model)
- **vitpose_backbone** -- `VitPoseBackboneConfig` (ViTPoseBackbone model)
- **vits** -- `VitsConfig` (VITS model)
- **vivit** -- [VivitConfig](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitConfig) (ViViT model)
- **vjepa2** -- `VJEPA2Config` (VJEPA2Model model)
- **voxtral** -- `VoxtralConfig` (Voxtral model)
- **voxtral_encoder** -- `VoxtralEncoderConfig` (Voxtral Encoder model)
- **wav2vec2** -- `Wav2Vec2Config` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertConfig` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerConfig` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMConfig` (WavLM model)
- **whisper** -- [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) (Whisper model)
- **xclip** -- [XCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPConfig) (X-CLIP model)
- **xcodec** -- `XcodecConfig` (X-CODEC model)
- **xglm** -- `XGLMConfig` (XGLM model)
- **xlm** -- `XLMConfig` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetConfig` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaConfig` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLConfig` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetConfig` (XLNet model)
- **xlstm** -- `xLSTMConfig` (xLSTM model)
- **xmod** -- `XmodConfig` (X-MOD model)
- **yolos** -- `YolosConfig` (YOLOS model)
- **yoso** -- `YosoConfig` (YOSO model)
- **zamba** -- `ZambaConfig` (Zamba model)
- **zamba2** -- `Zamba2Config` (Zamba2 model)
- **zoedepth** -- `ZoeDepthConfig` (ZoeDepth model)

Examples:

```python
>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True

>>> config, unused_kwargs = AutoConfig.from_pretrained(
...     "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True

>>> unused_kwargs
{'foo': False}
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing a configuration file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) method, e.g., `./my_model_directory/`.
  - A path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, this function returns just the final configuration object. If `True`, it returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoConfig.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/configuration_auto.py#L1386)

Register a new configuration for this class.

**Parameters:**

model_type (`str`) : The model type like "bert" or "gpt".

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The config to register.
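
A minimal, self-contained sketch of what `register()` does (the `CustomConfig` class and the `"custom-model"` key are illustrative, not part of the library):

```python
from transformers import AutoConfig, PretrainedConfig


class CustomConfig(PretrainedConfig):
    # model_type must match the key passed to AutoConfig.register()
    model_type = "custom-model"


# Register the custom configuration under a new model type key.
AutoConfig.register("custom-model", CustomConfig)

# AutoConfig can now resolve the key back to the registered class.
config = AutoConfig.for_model("custom-model")
```

After this, `AutoConfig.from_pretrained()` will also resolve any checkpoint whose config declares `"model_type": "custom-model"` to `CustomConfig`.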

## AutoTokenizer[[transformers.AutoTokenizer]]

#### transformers.AutoTokenizer[[transformers.AutoTokenizer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L948)

This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when
created with the [AutoTokenizer.from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoTokenizer.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoTokenizer.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L962)

**Parameters:**

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved
    using the [save_pretrained()](/docs/transformers/v4.57.1/ko/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`.
  - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a
    single vocabulary file (like Bert or XLNet), e.g.: `./my_model_directory/vocab.txt`. (Not
    applicable to all derived classes)
- **inputs** (additional positional arguments, *optional*) --
  Will be passed along to the Tokenizer `__init__()` method.
- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) --
  The configuration object used to determine the tokenizer class to instantiate.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force a (re-)download of the model weights and configuration files, overriding the
  cached versions if they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **subfolder** (`str`, *optional*) --
  In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for
  facebook/rag-token-base), specify it here.
- **use_fast** (`bool`, *optional*, defaults to `True`) --
  Use a [fast Rust-based tokenizer](https://huggingface.co/docs/tokenizers/index) if it is supported for
  a given model. If a fast tokenizer is not available for a given model, a normal Python-based tokenizer
  is returned instead.
- **tokenizer_type** (`str`, *optional*) --
  Tokenizer type to be loaded.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like
  `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`,
  `additional_special_tokens`. See parameters in the `__init__()` for more details.
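
Taken together, the parameters above are typically exercised like this (a sketch mirroring the library's stock examples; it assumes network access to huggingface.co or a warm local cache):

```python
from transformers import AutoTokenizer

# Download vocabulary from huggingface.co and cache it.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

# Force the slow Python-based tokenizer even when a fast one is available.
slow_tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased", use_fast=False)

tokens = tokenizer.tokenize("Hello world")
```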

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (AIMv2 model)
- **albert** -- `AlbertTokenizer` or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertTokenizerFast) (ALBERT model)
- **align** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (ALIGN model)
- **arcee** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Arcee model)
- **aria** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Aria model)
- **aya_vision** -- [CohereTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereTokenizerFast) (AyaVision model)
- **bark** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (Bark model)
- **bart** -- [BartTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartTokenizer) or [BartTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartTokenizerFast) (BART model)
- **barthez** -- [BarthezTokenizer](/docs/transformers/v4.57.1/ko/model_doc/barthez#transformers.BarthezTokenizer) or [BarthezTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/barthez#transformers.BarthezTokenizerFast) (BARThez model)
- **bartpho** -- [BartphoTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bartpho#transformers.BartphoTokenizer) (BARTpho model)
- **bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (BERT model)
- **bert-generation** -- `BertGenerationTokenizer` (Bert Generation model)
- **bert-japanese** -- [BertJapaneseTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert-japanese#transformers.BertJapaneseTokenizer) (BertJapanese model)
- **bertweet** -- [BertweetTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bertweet#transformers.BertweetTokenizer) (BERTweet model)
- **big_bird** -- `BigBirdTokenizer` or `BigBirdTokenizerFast` (BigBird model)
- **bigbird_pegasus** -- `PegasusTokenizer` or `PegasusTokenizerFast` (BigBird-Pegasus model)
- **biogpt** -- [BioGptTokenizer](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptTokenizer) (BioGpt model)
- **bitnet** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (BitNet model)
- **blenderbot** -- `BlenderbotTokenizer` or `BlenderbotTokenizerFast` (Blenderbot model)
- **blenderbot-small** -- `BlenderbotSmallTokenizer` (BlenderbotSmall model)
- **blip** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (BLIP model)
- **blip-2** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (BLIP-2 model)
- **bloom** -- `BloomTokenizerFast` (BLOOM model)
- **blt** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Blt model)
- **bridgetower** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (BridgeTower model)
- **bros** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (BROS model)
- **byt5** -- `ByT5Tokenizer` (ByT5 model)
- **camembert** -- `CamembertTokenizer` or `CamembertTokenizerFast` (CamemBERT model)
- **canine** -- `CanineTokenizer` (CANINE model)
- **chameleon** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Chameleon model)
- **chinese_clip** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (Chinese-CLIP model)
- **clap** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (CLAP model)
- **clip** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (CLIP model)
- **clipseg** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (CLIPSeg model)
- **clvp** -- `ClvpTokenizer` (CLVP model)
- **code_llama** -- `CodeLlamaTokenizer` or `CodeLlamaTokenizerFast` (CodeLlama model)
- **codegen** -- [CodeGenTokenizer](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenTokenizer) or [CodeGenTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenTokenizerFast) (CodeGen model)
- **cohere** -- [CohereTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereTokenizerFast) (Cohere model)
- **cohere2** -- [CohereTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereTokenizerFast) (Cohere2 model)
- **colpali** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (ColPali model)
- **colqwen2** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (ColQwen2 model)
- **convbert** -- [ConvBertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertTokenizer) or [ConvBertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertTokenizerFast) (ConvBERT model)
- **cpm** -- `CpmTokenizer` or `CpmTokenizerFast` (CPM model)
- **cpmant** -- `CpmAntTokenizer` (CPM-Ant model)
- **csm** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (CSM model)
- **ctrl** -- `CTRLTokenizer` (CTRL model)
- **data2vec-audio** -- `Wav2Vec2CTCTokenizer` (Data2VecAudio model)
- **data2vec-text** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (Data2VecText model)
- **dbrx** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (DBRX model)
- **deberta** -- [DebertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaTokenizer) or [DebertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaTokenizerFast) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Tokenizer) or [DebertaV2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2TokenizerFast) (DeBERTa-v2 model)
- **deepseek_v2** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (DeepSeek-V2 model)
- **deepseek_v3** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (DeepSeek-V3 model)
- **deepseek_vl** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (DeepseekVL model)
- **deepseek_vl_hybrid** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (DeepseekVLHybrid model)
- **dia** -- `DiaTokenizer` (Dia model)
- **diffllama** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (DiffLlama model)
- **distilbert** -- `DistilBertTokenizer` or `DistilBertTokenizerFast` (DistilBERT model)
- **dpr** -- `DPRQuestionEncoderTokenizer` or `DPRQuestionEncoderTokenizerFast` (DPR model)
- **electra** -- [ElectraTokenizer](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraTokenizer) or [ElectraTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraTokenizerFast) (ELECTRA model)
- **emu3** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (Emu3 model)
- **ernie** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (ERNIE model)
- **ernie4_5** -- [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Ernie4_5 model)
- **ernie4_5_moe** -- [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMTokenizer` (ErnieM model)
- **esm** -- [EsmTokenizer](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmTokenizer) (ESM model)
- **exaone4** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (EXAONE-4.0 model)
- **falcon** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Falcon model)
- **falcon_mamba** -- `GPTNeoXTokenizerFast` (FalconMamba model)
- **fastspeech2_conformer** --  (FastSpeech2Conformer model)
- **flaubert** -- `FlaubertTokenizer` (FlauBERT model)
- **flex_olmo** -- [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (FlexOlmo model)
- **fnet** -- `FNetTokenizer` or `FNetTokenizerFast` (FNet model)
- **fsmt** -- `FSMTTokenizer` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelTokenizer` or `FunnelTokenizerFast` (Funnel Transformer model)
- **gemma** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma model)
- **gemma2** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma2 model)
- **gemma3** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma3ForCausalLM model)
- **gemma3n** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma3nForConditionalGeneration model)
- **gemma3n_text** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Gemma3nForCausalLM model)
- **git** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (GIT model)
- **glm** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM model)
- **glm4** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4 model)
- **glm4_moe** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Glm4MoE model)
- **glm4v** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4V model)
- **glm4v_moe** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GLM4VMOE model)
- **gpt-sw3** -- `GPTSw3Tokenizer` (GPT-Sw3 model)
- **gpt2** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (OpenAI GPT-2 model)
- **gpt_bigcode** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (GPTBigCode model)
- **gpt_neo** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (GPT Neo model)
- **gpt_neox** -- `GPTNeoXTokenizerFast` (GPT NeoX model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseTokenizer) (GPT NeoX Japanese model)
- **gpt_oss** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (GptOss model)
- **gptj** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseTokenizer` (GPTSAN-japanese model)
- **granite** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) (Granite model)
- **granitemoe** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeMoe model)
- **granitemoehybrid** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeHybrid model)
- **granitemoeshared** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeSharedMoe model)
- **grounding-dino** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (Grounding DINO model)
- **groupvit** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (GroupViT model)
- **helium** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Helium model)
- **herbert** -- `HerbertTokenizer` or `HerbertTokenizerFast` (HerBERT model)
- **hubert** -- `Wav2Vec2CTCTokenizer` (Hubert model)
- **ibert** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (I-BERT model)
- **idefics** -- [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (IDEFICS model)
- **idefics2** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Idefics2 model)
- **idefics3** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Idefics3 model)
- **instructblip** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (InstructBLIP model)
- **instructblipvideo** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (InstructBlipVideo model)
- **internvl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (InternVL model)
- **jamba** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Jamba model)
- **janus** -- [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Janus model)
- **jetmoe** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (JetMoe model)
- **jukebox** -- `JukeboxTokenizer` (Jukebox model)
- **kosmos-2** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (KOSMOS-2 model)
- **kosmos-2.5** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (KOSMOS-2.5 model)
- **layoutlm** -- `LayoutLMTokenizer` or `LayoutLMTokenizerFast` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Tokenizer` or `LayoutLMv2TokenizerFast` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Tokenizer` or `LayoutLMv3TokenizerFast` (LayoutLMv3 model)
- **layoutxlm** -- `LayoutXLMTokenizer` or `LayoutXLMTokenizerFast` (LayoutXLM model)
- **led** -- `LEDTokenizer` or `LEDTokenizerFast` (LED model)
- **lilt** -- `LayoutLMv3Tokenizer` or `LayoutLMv3TokenizerFast` (LiLT model)
- **llama** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (LLaMA model)
- **llama4** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Llama4 model)
- **llama4_text** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Llama4ForCausalLM model)
- **llava** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (LLaVa model)
- **llava_next** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (LLaVA-NeXT model)
- **llava_next_video** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (LLaVa-NeXT-Video model)
- **llava_onevision** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (LLaVA-Onevision model)
- **longformer** -- `LongformerTokenizer` or `LongformerTokenizerFast` (Longformer model)
- **longt5** -- `T5Tokenizer` or `T5TokenizerFast` (LongT5 model)
- **luke** -- `LukeTokenizer` (LUKE model)
- **lxmert** -- `LxmertTokenizer` or `LxmertTokenizerFast` (LXMERT model)
- **m2m_100** -- `M2M100Tokenizer` (M2M100 model)
- **mamba** -- `GPTNeoXTokenizerFast` (Mamba model)
- **mamba2** -- `GPTNeoXTokenizerFast` (mamba2 model)
- **marian** -- [MarianTokenizer](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianTokenizer) (Marian model)
- **mbart** -- `MBartTokenizer` or `MBartTokenizerFast` (mBART model)
- **mbart50** -- `MBart50Tokenizer` or `MBart50TokenizerFast` (mBART-50 model)
- **mega** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (MEGA model)
- **megatron-bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (Megatron-BERT model)
- **metaclip_2** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrTokenizer` (MGP-STR model)
- **minimax** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (MiniMax model)
- **ministral** -- `MistralCommonTokenizer` (Ministral model)
- **mistral** -- `MistralCommonTokenizer` (Mistral model)
- **mistral3** -- `MistralCommonTokenizer` (Mistral3 model)
- **mixtral** -- `MistralCommonTokenizer` (Mixtral model)
- **mllama** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Mllama model)
- **mluke** -- `MLukeTokenizer` (mLUKE model)
- **mm-grounding-dino** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (MM Grounding DINO model)
- **mobilebert** -- `MobileBertTokenizer` or `MobileBertTokenizerFast` (MobileBERT model)
- **modernbert** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (ModernBERT model)
- **moonshine** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Moonshine model)
- **moshi** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Moshi model)
- **mpnet** -- `MPNetTokenizer` or `MPNetTokenizerFast` (MPNet model)
- **mpt** -- `GPTNeoXTokenizerFast` (MPT model)
- **mra** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (MRA model)
- **mt5** -- `MT5Tokenizer` or `MT5TokenizerFast` (MT5 model)
- **musicgen** -- `T5Tokenizer` or `T5TokenizerFast` (MusicGen model)
- **musicgen_melody** -- `T5Tokenizer` or `T5TokenizerFast` (MusicGen Melody model)
- **mvp** -- `MvpTokenizer` or `MvpTokenizerFast` (MVP model)
- **myt5** -- `MyT5Tokenizer` (myt5 model)
- **nemotron** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (Nemotron model)
- **nezha** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (Nezha model)
- **nllb** -- `NllbTokenizer` or `NllbTokenizerFast` (NLLB model)
- **nllb-moe** -- `NllbTokenizer` or `NllbTokenizerFast` (NLLB-MOE model)
- **nystromformer** -- `AlbertTokenizer` or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertTokenizerFast) (Nyströmformer model)
- **olmo** -- `GPTNeoXTokenizerFast` (OLMo model)
- **olmo2** -- `GPTNeoXTokenizerFast` (OLMo2 model)
- **olmo3** -- [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (Olmo3 model)
- **olmoe** -- `GPTNeoXTokenizerFast` (OLMoE model)
- **omdet-turbo** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (OmDet-Turbo model)
- **oneformer** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (OneFormer model)
- **openai-gpt** -- [OpenAIGPTTokenizer](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTTokenizer) or [OpenAIGPTTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTTokenizerFast) (OpenAI GPT model)
- **opt** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (OPT model)
- **owlv2** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (OWLv2 model)
- **owlvit** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (OWL-ViT model)
- **paligemma** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (PaliGemma model)
- **parakeet** -- `ParakeetCTCTokenizer` (Parakeet model)
- **pegasus** -- `PegasusTokenizer` or `PegasusTokenizerFast` (Pegasus model)
- **pegasus_x** -- `PegasusTokenizer` or `PegasusTokenizerFast` (PEGASUS-X model)
- **perceiver** -- `PerceiverTokenizer` (Perceiver model)
- **persimmon** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Persimmon model)
- **phi** -- [CodeGenTokenizer](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenTokenizer) or [CodeGenTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenTokenizerFast) (Phi model)
- **phi3** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Phi3 model)
- **phimoe** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Phimoe model)
- **phobert** -- `PhobertTokenizer` (PhoBERT model)
- **pix2struct** -- `T5Tokenizer` or `T5TokenizerFast` (Pix2Struct model)
- **pixtral** -- `MistralCommonTokenizer` (Pixtral model)
- **plbart** -- `PLBartTokenizer` (PLBart model)
- **prophetnet** -- `ProphetNetTokenizer` (ProphetNet model)
- **qdqbert** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (QDQBert model)
- **qwen2** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2 model)
- **qwen2_5_omni** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2Audio model)
- **qwen2_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2MoE model)
- **qwen2_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen2VL model)
- **qwen3** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3 model)
- **qwen3_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3MoE model)
- **qwen3_next** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3Next model)
- **qwen3_omni_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen2Tokenizer` or `Qwen2TokenizerFast` (Qwen3VLMoe model)
- **rag** -- [RagTokenizer](/docs/transformers/v4.57.1/ko/model_doc/rag#transformers.RagTokenizer) (RAG model)
- **realm** -- `RealmTokenizer` or `RealmTokenizerFast` (REALM model)
- **recurrent_gemma** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (RecurrentGemma model)
- **reformer** -- `ReformerTokenizer` or `ReformerTokenizerFast` (Reformer model)
- **rembert** -- `RemBertTokenizer` or `RemBertTokenizerFast` (RemBERT model)
- **retribert** -- `RetriBertTokenizer` or `RetriBertTokenizerFast` (RetriBERT model)
- **roberta** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (RoBERTa model)
- **roberta-prelayernorm** -- [RobertaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizer) or [RobertaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaTokenizerFast) (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertTokenizer` (RoCBert model)
- **roformer** -- `RoFormerTokenizer` or `RoFormerTokenizerFast` (RoFormer model)
- **rwkv** -- `GPTNeoXTokenizerFast` (RWKV model)
- **seamless_m4t** -- `SeamlessM4TTokenizer` or `SeamlessM4TTokenizerFast` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4TTokenizer` or `SeamlessM4TTokenizerFast` (SeamlessM4Tv2 model)
- **shieldgemma2** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (Shieldgemma2 model)
- **siglip** -- [SiglipTokenizer](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipTokenizer) (SigLIP model)
- **siglip2** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (SigLIP2 model)
- **smollm3** -- [PreTrainedTokenizerFast](/docs/transformers/v4.57.1/ko/main_classes/tokenizer#transformers.PreTrainedTokenizerFast) (SmolLM3 model)
- **speech_to_text** -- `Speech2TextTokenizer` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Tokenizer` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Tokenizer` (SpeechT5 model)
- **splinter** -- `SplinterTokenizer` or `SplinterTokenizerFast` (Splinter model)
- **squeezebert** -- `SqueezeBertTokenizer` or `SqueezeBertTokenizerFast` (SqueezeBERT model)
- **stablelm** -- `GPTNeoXTokenizerFast` (StableLm model)
- **starcoder2** -- [GPT2Tokenizer](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Tokenizer) or [GPT2TokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2TokenizerFast) (Starcoder2 model)
- **switch_transformers** -- `T5Tokenizer` or `T5TokenizerFast` (SwitchTransformers model)
- **t5** -- `T5Tokenizer` or `T5TokenizerFast` (T5 model)
- **t5gemma** -- [GemmaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizer) or [GemmaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaTokenizerFast) (T5Gemma model)
- **tapas** -- `TapasTokenizer` (TAPAS model)
- **tapex** -- `TapexTokenizer` (TAPEX model)
- **transfo-xl** -- `TransfoXLTokenizer` (Transformer-XL model)
- **tvp** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (TVP model)
- **udop** -- `UdopTokenizer` or `UdopTokenizerFast` (UDOP model)
- **umt5** -- `T5Tokenizer` or `T5TokenizerFast` (UMT5 model)
- **video_llava** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (VideoLlava model)
- **vilt** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (ViLT model)
- **vipllava** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (VipLlava model)
- **visual_bert** -- [BertTokenizer](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizer) or [BertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertTokenizerFast) (VisualBERT model)
- **vits** -- `VitsTokenizer` (VITS model)
- **voxtral** -- `MistralCommonTokenizer` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2CTCTokenizer` (Wav2Vec2-Conformer model)
- **wav2vec2_phoneme** -- `Wav2Vec2PhonemeCTCTokenizer` (Wav2Vec2Phoneme model)
- **whisper** -- [WhisperTokenizer](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperTokenizer) or [WhisperTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperTokenizerFast) (Whisper model)
- **xclip** -- [CLIPTokenizer](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizer) or [CLIPTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTokenizerFast) (X-CLIP model)
- **xglm** -- `XGLMTokenizer` or `XGLMTokenizerFast` (XGLM model)
- **xlm** -- `XLMTokenizer` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetTokenizer` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetTokenizer` or `XLNetTokenizerFast` (XLNet model)
- **xlstm** -- `GPTNeoXTokenizerFast` (xLSTM model)
- **xmod** -- `XLMRobertaTokenizer` or `XLMRobertaTokenizerFast` (X-MOD model)
- **yoso** -- `AlbertTokenizer` or [AlbertTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertTokenizerFast) (YOSO model)
- **zamba** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Zamba model)
- **zamba2** -- [LlamaTokenizer](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizer) or [LlamaTokenizerFast](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaTokenizerFast) (Zamba2 model)

Examples:

```python
>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")

>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co. - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved using the [save_pretrained()](/docs/transformers/v4.57.1/ko/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`. - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g.: `./my_model_directory/vocab.txt`. (Not applicable to all derived classes)

inputs (additional positional arguments, *optional*) : Will be passed along to the Tokenizer `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : The configuration object used to determine the tokenizer class to instantiate.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

subfolder (`str`, *optional*) : In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.

use_fast (`bool`, *optional*, defaults to `True`) : Use a [fast Rust-based tokenizer](https://huggingface.co/docs/tokenizers/index) if it is supported for a given model. If a fast tokenizer is not available for a given model, a normal Python-based tokenizer is returned instead.

tokenizer_type (`str`, *optional*) : Tokenizer type to be loaded.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`, `additional_special_tokens`. See parameters in the `__init__()` for more details.
#### register[[transformers.AutoTokenizer.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/tokenization_auto.py#L1190)

Register a new tokenizer in this mapping.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

slow_tokenizer_class (`PreTrainedTokenizer`, *optional*) : The slow tokenizer to register.

fast_tokenizer_class (`PreTrainedTokenizerFast`, *optional*) : The fast tokenizer to register.
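A minimal registration sketch for the parameters above. `NewModelConfig` and `NewModelTokenizer` are hypothetical names used only for illustration; a real tokenizer must also implement the vocabulary and tokenization methods:

```python
from transformers import AutoConfig, AutoTokenizer, PretrainedConfig, PreTrainedTokenizer

# Hypothetical classes for illustration only.
class NewModelConfig(PretrainedConfig):
    # model_type must match the key passed to AutoConfig.register below.
    model_type = "new-model"

class NewModelTokenizer(PreTrainedTokenizer):
    # Stub: a real tokenizer implements get_vocab, _tokenize, etc.
    pass

# Register the config under its model_type key, then map the config
# class to the (slow) tokenizer class.
AutoConfig.register("new-model", NewModelConfig)
AutoTokenizer.register(NewModelConfig, slow_tokenizer_class=NewModelTokenizer)
```

After this, `AutoTokenizer.from_pretrained` can resolve any checkpoint whose config has `model_type: "new-model"` to `NewModelTokenizer`.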

## AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

#### transformers.AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L255)

This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the
library when created with the [AutoFeatureExtractor.from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoFeatureExtractor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoFeatureExtractor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L269)

Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.

The feature extractor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- `ASTFeatureExtractor` (Audio Spectrogram Transformer model)
- **beit** -- `BeitFeatureExtractor` (BEiT model)
- **chinese_clip** -- `ChineseCLIPFeatureExtractor` (Chinese-CLIP model)
- **clap** -- `ClapFeatureExtractor` (CLAP model)
- **clip** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPFeatureExtractor) (CLIP model)
- **clipseg** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (CLIPSeg model)
- **clvp** -- `ClvpFeatureExtractor` (CLVP model)
- **conditional_detr** -- `ConditionalDetrFeatureExtractor` (Conditional DETR model)
- **convnext** -- `ConvNextFeatureExtractor` (ConvNeXT model)
- **cvt** -- `ConvNextFeatureExtractor` (CvT model)
- **dac** -- `DacFeatureExtractor` (DAC model)
- **data2vec-audio** -- `Wav2Vec2FeatureExtractor` (Data2VecAudio model)
- **data2vec-vision** -- `BeitFeatureExtractor` (Data2VecVision model)
- **deformable_detr** -- `DeformableDetrFeatureExtractor` (Deformable DETR model)
- **deit** -- `DeiTFeatureExtractor` (DeiT model)
- **detr** -- `DetrFeatureExtractor` (DETR model)
- **dia** -- `DiaFeatureExtractor` (Dia model)
- **dinat** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (DiNAT model)
- **donut-swin** -- `DonutFeatureExtractor` (DonutSwin model)
- **dpt** -- `DPTFeatureExtractor` (DPT model)
- **encodec** -- `EncodecFeatureExtractor` (EnCodec model)
- **flava** -- `FlavaFeatureExtractor` (FLAVA model)
- **gemma3n** -- `Gemma3nAudioFeatureExtractor` (Gemma3nForConditionalGeneration model)
- **glpn** -- `GLPNFeatureExtractor` (GLPN model)
- **granite_speech** -- `GraniteSpeechFeatureExtractor` (GraniteSpeech model)
- **groupvit** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPFeatureExtractor) (GroupViT model)
- **hubert** -- `Wav2Vec2FeatureExtractor` (Hubert model)
- **imagegpt** -- `ImageGPTFeatureExtractor` (ImageGPT model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextFeatureExtractor` (KyutaiSpeechToText model)
- **layoutlmv2** -- `LayoutLMv2FeatureExtractor` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3FeatureExtractor` (LayoutLMv3 model)
- **levit** -- `LevitFeatureExtractor` (LeViT model)
- **maskformer** -- `MaskFormerFeatureExtractor` (MaskFormer model)
- **mctct** -- `MCTCTFeatureExtractor` (M-CTC-T model)
- **mimi** -- `EncodecFeatureExtractor` (Mimi model)
- **mobilenet_v1** -- `MobileNetV1FeatureExtractor` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2FeatureExtractor` (MobileNetV2 model)
- **mobilevit** -- `MobileViTFeatureExtractor` (MobileViT model)
- **moonshine** -- `Wav2Vec2FeatureExtractor` (Moonshine model)
- **moshi** -- `EncodecFeatureExtractor` (Moshi model)
- **nat** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (NAT model)
- **owlvit** -- `OwlViTFeatureExtractor` (OWL-ViT model)
- **parakeet_ctc** -- `ParakeetFeatureExtractor` (Parakeet model)
- **parakeet_encoder** -- `ParakeetFeatureExtractor` (ParakeetEncoder model)
- **perceiver** -- `PerceiverFeatureExtractor` (Perceiver model)
- **phi4_multimodal** -- `Phi4MultimodalFeatureExtractor` (Phi4Multimodal model)
- **poolformer** -- `PoolFormerFeatureExtractor` (PoolFormer model)
- **pop2piano** -- `Pop2PianoFeatureExtractor` (Pop2Piano model)
- **regnet** -- `ConvNextFeatureExtractor` (RegNet model)
- **resnet** -- `ConvNextFeatureExtractor` (ResNet model)
- **seamless_m4t** -- `SeamlessM4TFeatureExtractor` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4TFeatureExtractor` (SeamlessM4Tv2 model)
- **segformer** -- `SegformerFeatureExtractor` (SegFormer model)
- **sew** -- `Wav2Vec2FeatureExtractor` (SEW model)
- **sew-d** -- `Wav2Vec2FeatureExtractor` (SEW-D model)
- **speech_to_text** -- `Speech2TextFeatureExtractor` (Speech2Text model)
- **speecht5** -- `SpeechT5FeatureExtractor` (SpeechT5 model)
- **swiftformer** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (SwiftFormer model)
- **swin** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (Swin Transformer model)
- **swinv2** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (Swin Transformer V2 model)
- **table-transformer** -- `DetrFeatureExtractor` (Table Transformer model)
- **timesformer** -- `VideoMAEFeatureExtractor` (TimeSformer model)
- **tvlt** -- `TvltFeatureExtractor` (TVLT model)
- **unispeech** -- `Wav2Vec2FeatureExtractor` (UniSpeech model)
- **unispeech-sat** -- `Wav2Vec2FeatureExtractor` (UniSpeechSat model)
- **univnet** -- `UnivNetFeatureExtractor` (UnivNet model)
- **van** -- `ConvNextFeatureExtractor` (VAN model)
- **videomae** -- `VideoMAEFeatureExtractor` (VideoMAE model)
- **vilt** -- `ViltFeatureExtractor` (ViLT model)
- **vit** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (ViT model)
- **vit_mae** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (ViTMAE model)
- **vit_msn** -- [ViTFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTFeatureExtractor) (ViTMSN model)
- **wav2vec2** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2FeatureExtractor` (Wav2Vec2-Conformer model)
- **wavlm** -- `Wav2Vec2FeatureExtractor` (WavLM model)
- **whisper** -- [WhisperFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperFeatureExtractor) (Whisper model)
- **xclip** -- [CLIPFeatureExtractor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPFeatureExtractor) (X-CLIP model)
- **xcodec** -- `DacFeatureExtractor` (X-CODEC model)
- **yolos** -- `YolosFeatureExtractor` (YOLOS model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on huggingface.co. - a path to a *directory* containing a feature extractor file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/feature_extractor#transformers.FeatureExtractionMixin.save_pretrained) method, e.g., `./my_model_directory/`. - a path or url to a saved feature extractor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force a (re-)download of the feature extractor files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final feature extractor object. If `True`, then this function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoFeatureExtractor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/feature_extraction_auto.py#L409)

Register a new feature extractor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

feature_extractor_class (`FeatureExtractorMixin`) : The feature extractor to register.
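The `register()` call wires a configuration class to a feature extractor class so that the auto class can later resolve one from the other. The snippet below is a minimal, self-contained sketch of that registry mechanism — all class names are hypothetical stand-ins, and this is not the actual transformers implementation:

```python
# Minimal sketch of the Auto-class registry mechanism. All names here are
# hypothetical; the real transformers registry is more involved.

class NewModelConfig:
    model_type = "new-model"  # key associated with this config


class NewModelFeatureExtractor:
    """Hypothetical custom feature extractor."""


class AutoFeatureExtractorSketch:
    _registry = {}  # maps config class -> feature extractor class

    @classmethod
    def register(cls, config_class, feature_extractor_class):
        cls._registry[config_class] = feature_extractor_class

    @classmethod
    def for_config(cls, config):
        # Resolve the feature extractor class from the config's type.
        return cls._registry[type(config)]


AutoFeatureExtractorSketch.register(NewModelConfig, NewModelFeatureExtractor)
extractor_cls = AutoFeatureExtractorSketch.for_config(NewModelConfig())
```

In the real library, `from_pretrained()` consults this kind of mapping after loading the configuration, which is why the registered config's `model_type` must match the key used at registration time.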

## AutoImageProcessor[[transformers.AutoImageProcessor]]

#### transformers.AutoImageProcessor[[transformers.AutoImageProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L354)

This is a generic image processor class that will be instantiated as one of the image processor classes of the
library when created with the [AutoImageProcessor.from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoImageProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoImageProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L368)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained image_processor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing an image processor file saved using the
    [save_pretrained()](/docs/transformers/v4.57.1/ko/internal/image_processing_utils#transformers.ImageProcessingMixin.save_pretrained) method, e.g.,
    `./my_model_directory/`.
  - a path or url to a saved image processor JSON *file*, e.g.,
    `./my_model_directory/preprocessor_config.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model image processor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the image processor files, overriding the cached versions if
  they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **use_fast** (`bool`, *optional*, defaults to `False`) --
  Use a fast torchvision-based image processor if it is supported for a given model.
  If a fast image processor is not available for a given model, a normal numpy-based image processor
  is returned instead.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final image processor object. If `True`, then this
  function returns a `Tuple(image_processor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of
  `kwargs` which has not been used to update `image_processor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **image_processor_filename** (`str`, *optional*, defaults to `"config.json"`) --
  The name of the file in the model directory to use for the image processor config.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are image processor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* image processor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the image processor classes of the library from a pretrained model vocabulary.

The image processor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (AIMv2 model)
- **aimv2_vision_model** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (Aimv2VisionModel model)
- **align** -- `EfficientNetImageProcessor` or `EfficientNetImageProcessorFast` (ALIGN model)
- **aria** -- `AriaImageProcessor` (Aria model)
- **beit** -- `BeitImageProcessor` or `BeitImageProcessorFast` (BEiT model)
- **bit** -- `BitImageProcessor` or `BitImageProcessorFast` (BiT model)
- **blip** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipImageProcessor) or `BlipImageProcessorFast` (BLIP model)
- **blip-2** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipImageProcessor) or `BlipImageProcessorFast` (BLIP-2 model)
- **bridgetower** -- `BridgeTowerImageProcessor` or `BridgeTowerImageProcessorFast` (BridgeTower model)
- **chameleon** -- [ChameleonImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonImageProcessor) or `ChameleonImageProcessorFast` (Chameleon model)
- **chinese_clip** -- `ChineseCLIPImageProcessor` or `ChineseCLIPImageProcessorFast` (Chinese-CLIP model)
- **clip** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (CLIP model)
- **clipseg** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (CLIPSeg model)
- **cohere2_vision** -- `Cohere2VisionImageProcessorFast` (Cohere2Vision model)
- **conditional_detr** -- `ConditionalDetrImageProcessor` or `ConditionalDetrImageProcessorFast` (Conditional DETR model)
- **convnext** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (ConvNeXT model)
- **convnextv2** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (ConvNeXTV2 model)
- **cvt** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (CvT model)
- **data2vec-vision** -- `BeitImageProcessor` or `BeitImageProcessorFast` (Data2VecVision model)
- **deepseek_vl** -- `DeepseekVLImageProcessor` or `DeepseekVLImageProcessorFast` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridImageProcessor` or `DeepseekVLHybridImageProcessorFast` (DeepseekVLHybrid model)
- **deformable_detr** -- `DeformableDetrImageProcessor` or `DeformableDetrImageProcessorFast` (Deformable DETR model)
- **deit** -- `DeiTImageProcessor` or `DeiTImageProcessorFast` (DeiT model)
- **depth_anything** -- `DPTImageProcessor` or `DPTImageProcessorFast` (Depth Anything model)
- **depth_pro** -- `DepthProImageProcessor` or `DepthProImageProcessorFast` (DepthPro model)
- **deta** -- `DetaImageProcessor` (DETA model)
- **detr** -- `DetrImageProcessor` or `DetrImageProcessorFast` (DETR model)
- **dinat** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (DiNAT model)
- **dinov2** -- `BitImageProcessor` or `BitImageProcessorFast` (DINOv2 model)
- **dinov3_vit** -- `DINOv3ViTImageProcessorFast` (DINOv3 ViT model)
- **donut-swin** -- `DonutImageProcessor` or `DonutImageProcessorFast` (DonutSwin model)
- **dpt** -- `DPTImageProcessor` or `DPTImageProcessorFast` (DPT model)
- **edgetam** -- `Sam2ImageProcessorFast` (EdgeTAM model)
- **efficientformer** -- `EfficientFormerImageProcessor` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRImageProcessor` or `EfficientLoFTRImageProcessorFast` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetImageProcessor` or `EfficientNetImageProcessorFast` (EfficientNet model)
- **eomt** -- `EomtImageProcessor` or `EomtImageProcessorFast` (EoMT model)
- **flava** -- `FlavaImageProcessor` or `FlavaImageProcessorFast` (FLAVA model)
- **focalnet** -- `BitImageProcessor` or `BitImageProcessorFast` (FocalNet model)
- **fuyu** -- `FuyuImageProcessor` (Fuyu model)
- **gemma3** -- [Gemma3ImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ImageProcessor) or [Gemma3ImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ImageProcessorFast) (Gemma3ForConditionalGeneration model)
- **gemma3n** -- [SiglipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessor) or [SiglipImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessorFast) (Gemma3nForConditionalGeneration model)
- **git** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (GIT model)
- **glm4v** -- `Glm4vImageProcessor` or `Glm4vImageProcessorFast` (GLM4V model)
- **glpn** -- `GLPNImageProcessor` (GLPN model)
- **got_ocr2** -- `GotOcr2ImageProcessor` or `GotOcr2ImageProcessorFast` (GOT-OCR2 model)
- **grounding-dino** -- [GroundingDinoImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoImageProcessor) or [GroundingDinoImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoImageProcessorFast) (Grounding DINO model)
- **groupvit** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (GroupViT model)
- **hiera** -- `BitImageProcessor` or `BitImageProcessorFast` (Hiera model)
- **idefics** -- `IdeficsImageProcessor` (IDEFICS model)
- **idefics2** -- `Idefics2ImageProcessor` or `Idefics2ImageProcessorFast` (Idefics2 model)
- **idefics3** -- `Idefics3ImageProcessor` or `Idefics3ImageProcessorFast` (Idefics3 model)
- **ijepa** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (I-JEPA model)
- **imagegpt** -- `ImageGPTImageProcessor` or `ImageGPTImageProcessorFast` (ImageGPT model)
- **instructblip** -- [BlipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipImageProcessor) or `BlipImageProcessorFast` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoImageProcessor` (InstructBlipVideo model)
- **janus** -- `JanusImageProcessor` or `JanusImageProcessorFast` (Janus model)
- **kosmos-2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5ImageProcessor` or `Kosmos2_5ImageProcessorFast` (KOSMOS-2.5 model)
- **layoutlmv2** -- `LayoutLMv2ImageProcessor` or `LayoutLMv2ImageProcessorFast` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ImageProcessor` or `LayoutLMv3ImageProcessorFast` (LayoutLMv3 model)
- **levit** -- `LevitImageProcessor` or `LevitImageProcessorFast` (LeViT model)
- **lfm2_vl** -- `Lfm2VlImageProcessorFast` (Lfm2Vl model)
- **lightglue** -- `LightGlueImageProcessor` (LightGlue model)
- **llama4** -- `Llama4ImageProcessor` or `Llama4ImageProcessorFast` (Llama4 model)
- **llava** -- `LlavaImageProcessor` or `LlavaImageProcessorFast` (LLaVa model)
- **llava_next** -- `LlavaNextImageProcessor` or `LlavaNextImageProcessorFast` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoImageProcessor` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionImageProcessor` or `LlavaOnevisionImageProcessorFast` (LLaVA-Onevision model)
- **mask2former** -- `Mask2FormerImageProcessor` or `Mask2FormerImageProcessorFast` (Mask2Former model)
- **maskformer** -- `MaskFormerImageProcessor` or `MaskFormerImageProcessorFast` (MaskFormer model)
- **metaclip_2** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (MetaCLIP 2 model)
- **mgp-str** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (MGP-STR model)
- **mistral3** -- `PixtralImageProcessor` or `PixtralImageProcessorFast` (Mistral3 model)
- **mlcd** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (MLCD model)
- **mllama** -- `MllamaImageProcessor` (Mllama model)
- **mm-grounding-dino** -- [GroundingDinoImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoImageProcessor) or [GroundingDinoImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoImageProcessorFast) (MM Grounding DINO model)
- **mobilenet_v1** -- `MobileNetV1ImageProcessor` or `MobileNetV1ImageProcessorFast` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2ImageProcessor` or `MobileNetV2ImageProcessorFast` (MobileNetV2 model)
- **mobilevit** -- `MobileViTImageProcessor` or `MobileViTImageProcessorFast` (MobileViT model)
- **mobilevitv2** -- `MobileViTImageProcessor` or `MobileViTImageProcessorFast` (MobileViTV2 model)
- **nat** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (NAT model)
- **nougat** -- `NougatImageProcessor` or `NougatImageProcessorFast` (Nougat model)
- **oneformer** -- `OneFormerImageProcessor` or `OneFormerImageProcessorFast` (OneFormer model)
- **ovis2** -- `Ovis2ImageProcessor` or `Ovis2ImageProcessorFast` (Ovis2 model)
- **owlv2** -- `Owlv2ImageProcessor` or `Owlv2ImageProcessorFast` (OWLv2 model)
- **owlvit** -- `OwlViTImageProcessor` or `OwlViTImageProcessorFast` (OWL-ViT model)
- **paligemma** -- [SiglipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessor) or [SiglipImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessorFast) (PaliGemma model)
- **perceiver** -- `PerceiverImageProcessor` or `PerceiverImageProcessorFast` (Perceiver model)
- **perception_lm** -- `PerceptionLMImageProcessorFast` (PerceptionLM model)
- **phi4_multimodal** -- `Phi4MultimodalImageProcessorFast` (Phi4Multimodal model)
- **pix2struct** -- `Pix2StructImageProcessor` (Pix2Struct model)
- **pixtral** -- `PixtralImageProcessor` or `PixtralImageProcessorFast` (Pixtral model)
- **poolformer** -- `PoolFormerImageProcessor` or `PoolFormerImageProcessorFast` (PoolFormer model)
- **prompt_depth_anything** -- `PromptDepthAnythingImageProcessor` or `PromptDepthAnythingImageProcessorFast` (PromptDepthAnything model)
- **pvt** -- `PvtImageProcessor` or `PvtImageProcessorFast` (PVT model)
- **pvt_v2** -- `PvtImageProcessor` or `PvtImageProcessorFast` (PVTv2 model)
- **qwen2_5_vl** -- [Qwen2VLImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessor) or [Qwen2VLImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessorFast) (Qwen2_5_VL model)
- **qwen2_vl** -- [Qwen2VLImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessor) or [Qwen2VLImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessorFast) (Qwen2VL model)
- **qwen3_vl** -- [Qwen2VLImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessor) or [Qwen2VLImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLImageProcessorFast) (Qwen3VL model)
- **regnet** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (RegNet model)
- **resnet** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (ResNet model)
- **rt_detr** -- `RTDetrImageProcessor` or `RTDetrImageProcessorFast` (RT-DETR model)
- **sam** -- `SamImageProcessor` or `SamImageProcessorFast` (SAM model)
- **sam2** -- `Sam2ImageProcessorFast` (SAM2 model)
- **sam_hq** -- `SamImageProcessor` or `SamImageProcessorFast` (SAM-HQ model)
- **segformer** -- `SegformerImageProcessor` or `SegformerImageProcessorFast` (SegFormer model)
- **seggpt** -- `SegGptImageProcessor` (SegGPT model)
- **shieldgemma2** -- [Gemma3ImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ImageProcessor) or [Gemma3ImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ImageProcessorFast) (Shieldgemma2 model)
- **siglip** -- [SiglipImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessor) or [SiglipImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipImageProcessorFast) (SigLIP model)
- **siglip2** -- `Siglip2ImageProcessor` or `Siglip2ImageProcessorFast` (SigLIP2 model)
- **smolvlm** -- [SmolVLMImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMImageProcessor) or [SmolVLMImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMImageProcessorFast) (SmolVLM model)
- **superglue** -- `SuperGlueImageProcessor` (SuperGlue model)
- **superpoint** -- `SuperPointImageProcessor` or `SuperPointImageProcessorFast` (SuperPoint model)
- **swiftformer** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (SwiftFormer model)
- **swin** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (Swin Transformer model)
- **swin2sr** -- [Swin2SRImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRImageProcessor) or `Swin2SRImageProcessorFast` (Swin2SR model)
- **swinv2** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (Swin Transformer V2 model)
- **table-transformer** -- `DetrImageProcessor` or `DetrImageProcessorFast` (Table Transformer model)
- **textnet** -- `TextNetImageProcessor` or `TextNetImageProcessorFast` (TextNet model)
- **timesformer** -- `VideoMAEImageProcessor` (TimeSformer model)
- **timm_wrapper** -- `TimmWrapperImageProcessor` (TimmWrapperModel model)
- **tvlt** -- `TvltImageProcessor` (TVLT model)
- **tvp** -- [TvpImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpImageProcessor) or `TvpImageProcessorFast` (TVP model)
- **udop** -- `LayoutLMv3ImageProcessor` or `LayoutLMv3ImageProcessorFast` (UDOP model)
- **upernet** -- `SegformerImageProcessor` or `SegformerImageProcessorFast` (UPerNet model)
- **van** -- `ConvNextImageProcessor` or `ConvNextImageProcessorFast` (VAN model)
- **videomae** -- `VideoMAEImageProcessor` (VideoMAE model)
- **vilt** -- `ViltImageProcessor` or `ViltImageProcessorFast` (ViLT model)
- **vipllava** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (VipLlava model)
- **vit** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (ViT model)
- **vit_hybrid** -- `ViTHybridImageProcessor` (ViT Hybrid model)
- **vit_mae** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (ViTMAE model)
- **vit_msn** -- [ViTImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessor) or [ViTImageProcessorFast](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTImageProcessorFast) (ViTMSN model)
- **vitmatte** -- `VitMatteImageProcessor` or `VitMatteImageProcessorFast` (ViTMatte model)
- **xclip** -- [CLIPImageProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPImageProcessor) or `CLIPImageProcessorFast` (X-CLIP model)
- **yolos** -- `YolosImageProcessor` or `YolosImageProcessorFast` (YOLOS model)
- **zoedepth** -- `ZoeDepthImageProcessor` or `ZoeDepthImageProcessorFast` (ZoeDepth model)
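The selection order described above — prefer the config's `model_type`, otherwise fall back to pattern matching on `pretrained_model_name_or_path` — can be sketched in plain Python. This is an illustrative simplification, not the actual transformers logic, and the two-entry mapping is a tiny hypothetical subset of the table above:

```python
# Illustrative sketch of the image processor selection order (not the
# actual transformers implementation). Hypothetical two-entry mapping:
MAPPING = {"vit": "ViTImageProcessor", "clip": "CLIPImageProcessor"}


def select_image_processor(name_or_path, model_type=None):
    # 1) If a config was available, its model_type decides directly.
    if model_type is not None:
        return MAPPING[model_type]
    # 2) Fallback: substring pattern matching on the model id or path.
    for pattern, processor_cls in MAPPING.items():
        if pattern in name_or_path:
            return processor_cls
    raise ValueError(f"Unrecognized model in {name_or_path!r}")


selected = select_image_processor("google/vit-base-patch16-224-in21k")
```

The real resolution additionally handles fast/slow variants, remote code, and registered custom classes, but the two-step preference order is the same.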

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")
```
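The `return_unused_kwargs` behavior in the parameter list can be pictured as a simple split of `kwargs` into keys that match processor attributes and keys that do not. The helper below is a pure-Python sketch with hypothetical attribute names, not the transformers implementation:

```python
# Sketch of the return_unused_kwargs semantics: kwargs matching processor
# attributes override loaded values; the rest are returned as "unused".
# Attribute names here are hypothetical.

def split_kwargs(attrs, kwargs):
    used = {k: v for k, v in kwargs.items() if k in attrs}
    unused = {k: v for k, v in kwargs.items() if k not in attrs}
    return used, unused


used, unused = split_kwargs(
    attrs={"do_resize", "size"},
    kwargs={"do_resize": False, "foo": 1},
)
```

With `return_unused_kwargs=True`, the second element of the returned tuple corresponds to `unused` here; with the default `False`, those extra keys are silently ignored.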

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained image_processor hosted inside a model repo on huggingface.co. - a path to a *directory* containing an image processor file saved using the [save_pretrained()](/docs/transformers/v4.57.1/ko/internal/image_processing_utils#transformers.ImageProcessingMixin.save_pretrained) method, e.g., `./my_model_directory/`. - a path or url to a saved image processor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the image processor files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

use_fast (`bool`, *optional*, defaults to `False`) : Use a fast torchvision-based image processor if it is supported for a given model. If a fast image processor is not available for a given model, a normal numpy-based image processor is returned instead.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final image processor object. If `True`, then this function returns a `Tuple(image_processor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of `kwargs` which has not been used to update `image_processor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

image_processor_filename (`str`, *optional*, defaults to `"config.json"`) : The name of the file in the model directory to use for the image processor config.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are image processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* image processor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoImageProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/image_processing_auto.py#L628)

Register a new image processor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

image_processor_class ([ImageProcessingMixin](/docs/transformers/v4.57.1/ko/internal/image_processing_utils#transformers.ImageProcessingMixin)) : The image processor to register.

## AutoProcessor[[transformers.AutoProcessor]]

#### transformers.AutoProcessor[[transformers.AutoProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L188)

This is a generic processor class that will be instantiated as one of the processor classes of the library when
created with the [AutoProcessor.from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L202)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained processor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing processor files saved using the `save_pretrained()` method,
    e.g., `./my_model_directory/`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained processor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the processor files, overriding the cached versions
  if they exist.
- **resume_download** --
  Deprecated and ignored. All downloads are now resumed by default when possible.
  Will be removed in v5 of Transformers.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final processor object. If `True`, then this
  function returns a `Tuple(processor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not processor attributes: i.e., the part of
  `kwargs` which has not been used to update the processor and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are processor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* processor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the processor classes of the library from a pretrained model vocabulary.

The processor class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible):

- **aimv2** -- [CLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPProcessor) (AIMv2 model)
- **align** -- `AlignProcessor` (ALIGN model)
- **altclip** -- [AltCLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPProcessor) (AltCLIP model)
- **aria** -- `AriaProcessor` (Aria model)
- **aya_vision** -- `AyaVisionProcessor` (AyaVision model)
- **bark** -- `BarkProcessor` (Bark model)
- **blip** -- [BlipProcessor](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipProcessor) (BLIP model)
- **blip-2** -- [Blip2Processor](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Processor) (BLIP-2 model)
- **bridgetower** -- `BridgeTowerProcessor` (BridgeTower model)
- **chameleon** -- [ChameleonProcessor](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonProcessor) (Chameleon model)
- **chinese_clip** -- `ChineseCLIPProcessor` (Chinese-CLIP model)
- **clap** -- `ClapProcessor` (CLAP model)
- **clip** -- [CLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPProcessor) (CLIP model)
- **clipseg** -- [CLIPSegProcessor](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegProcessor) (CLIPSeg model)
- **clvp** -- `ClvpProcessor` (CLVP model)
- **cohere2_vision** -- `Cohere2VisionProcessor` (Cohere2Vision model)
- **colpali** -- `ColPaliProcessor` (ColPali model)
- **colqwen2** -- `ColQwen2Processor` (ColQwen2 model)
- **deepseek_vl** -- `DeepseekVLProcessor` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridProcessor` (DeepseekVLHybrid model)
- **dia** -- `DiaProcessor` (Dia model)
- **edgetam** -- `Sam2Processor` (EdgeTAM model)
- **emu3** -- `Emu3Processor` (Emu3 model)
- **evolla** -- `EvollaProcessor` (Evolla model)
- **flava** -- `FlavaProcessor` (FLAVA model)
- **florence2** -- `Florence2Processor` (Florence2 model)
- **fuyu** -- `FuyuProcessor` (Fuyu model)
- **gemma3** -- [Gemma3Processor](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Processor) (Gemma3ForConditionalGeneration model)
- **gemma3n** -- `Gemma3nProcessor` (Gemma3nForConditionalGeneration model)
- **git** -- `GitProcessor` (GIT model)
- **glm4v** -- `Glm4vProcessor` (GLM4V model)
- **glm4v_moe** -- `Glm4vProcessor` (GLM4VMOE model)
- **got_ocr2** -- `GotOcr2Processor` (GOT-OCR2 model)
- **granite_speech** -- `GraniteSpeechProcessor` (GraniteSpeech model)
- **grounding-dino** -- [GroundingDinoProcessor](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoProcessor) (Grounding DINO model)
- **groupvit** -- [CLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPProcessor) (GroupViT model)
- **hubert** -- `Wav2Vec2Processor` (Hubert model)
- **idefics** -- `IdeficsProcessor` (IDEFICS model)
- **idefics2** -- `Idefics2Processor` (Idefics2 model)
- **idefics3** -- `Idefics3Processor` (Idefics3 model)
- **instructblip** -- `InstructBlipProcessor` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoProcessor` (InstructBlipVideo model)
- **internvl** -- `InternVLProcessor` (InternVL model)
- **janus** -- `JanusProcessor` (Janus model)
- **kosmos-2** -- `Kosmos2Processor` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Processor` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextProcessor` (KyutaiSpeechToText model)
- **layoutlmv2** -- `LayoutLMv2Processor` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Processor` (LayoutLMv3 model)
- **lfm2_vl** -- `Lfm2VlProcessor` (Lfm2Vl model)
- **llama4** -- `Llama4Processor` (Llama4 model)
- **llava** -- `LlavaProcessor` (LLaVa model)
- **llava_next** -- `LlavaNextProcessor` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoProcessor` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionProcessor` (LLaVA-Onevision model)
- **markuplm** -- `MarkupLMProcessor` (MarkupLM model)
- **mctct** -- `MCTCTProcessor` (M-CTC-T model)
- **metaclip_2** -- [CLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPProcessor) (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrProcessor` (MGP-STR model)
- **mistral3** -- `PixtralProcessor` (Mistral3 model)
- **mllama** -- `MllamaProcessor` (Mllama model)
- **mm-grounding-dino** -- [GroundingDinoProcessor](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoProcessor) (MM Grounding DINO model)
- **moonshine** -- `Wav2Vec2Processor` (Moonshine model)
- **oneformer** -- `OneFormerProcessor` (OneFormer model)
- **ovis2** -- `Ovis2Processor` (Ovis2 model)
- **owlv2** -- `Owlv2Processor` (OWLv2 model)
- **owlvit** -- `OwlViTProcessor` (OWL-ViT model)
- **paligemma** -- [PaliGemmaProcessor](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaProcessor) (PaliGemma model)
- **perception_lm** -- `PerceptionLMProcessor` (PerceptionLM model)
- **phi4_multimodal** -- `Phi4MultimodalProcessor` (Phi4Multimodal model)
- **pix2struct** -- `Pix2StructProcessor` (Pix2Struct model)
- **pixtral** -- `PixtralProcessor` (Pixtral model)
- **pop2piano** -- `Pop2PianoProcessor` (Pop2Piano model)
- **qwen2_5_omni** -- `Qwen2_5OmniProcessor` (Qwen2_5Omni model)
- **qwen2_5_vl** -- `Qwen2_5_VLProcessor` (Qwen2_5_VL model)
- **qwen2_audio** -- `Qwen2AudioProcessor` (Qwen2Audio model)
- **qwen2_vl** -- [Qwen2VLProcessor](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLProcessor) (Qwen2VL model)
- **qwen3_omni_moe** -- `Qwen3OmniMoeProcessor` (Qwen3OmniMoE model)
- **qwen3_vl** -- `Qwen3VLProcessor` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLProcessor` (Qwen3VLMoe model)
- **sam** -- `SamProcessor` (SAM model)
- **sam2** -- `Sam2Processor` (SAM2 model)
- **sam_hq** -- `SamHQProcessor` (SAM-HQ model)
- **seamless_m4t** -- `SeamlessM4TProcessor` (SeamlessM4T model)
- **sew** -- `Wav2Vec2Processor` (SEW model)
- **sew-d** -- `Wav2Vec2Processor` (SEW-D model)
- **shieldgemma2** -- `ShieldGemma2Processor` (Shieldgemma2 model)
- **siglip** -- [SiglipProcessor](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipProcessor) (SigLIP model)
- **siglip2** -- `Siglip2Processor` (SigLIP2 model)
- **smolvlm** -- [SmolVLMProcessor](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMProcessor) (SmolVLM model)
- **speech_to_text** -- `Speech2TextProcessor` (Speech2Text model)
- **speech_to_text_2** -- `Speech2Text2Processor` (Speech2Text2 model)
- **speecht5** -- `SpeechT5Processor` (SpeechT5 model)
- **trocr** -- `TrOCRProcessor` (TrOCR model)
- **tvlt** -- `TvltProcessor` (TVLT model)
- **tvp** -- [TvpProcessor](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpProcessor) (TVP model)
- **udop** -- `UdopProcessor` (UDOP model)
- **unispeech** -- `Wav2Vec2Processor` (UniSpeech model)
- **unispeech-sat** -- `Wav2Vec2Processor` (UniSpeechSat model)
- **video_llava** -- `VideoLlavaProcessor` (VideoLlava model)
- **vilt** -- `ViltProcessor` (ViLT model)
- **vipllava** -- `LlavaProcessor` (VipLlava model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderProcessor` (VisionTextDualEncoder model)
- **voxtral** -- `VoxtralProcessor` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2Processor` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2Processor` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2Processor` (Wav2Vec2-Conformer model)
- **wavlm** -- `Wav2Vec2Processor` (WavLM model)
- **whisper** -- [WhisperProcessor](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperProcessor) (Whisper model)
- **xclip** -- [XCLIPProcessor](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPProcessor) (X-CLIP model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")
```
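The `model_type`-based dispatch described above can be sketched as a plain mapping from `model_type` strings to processor class names. This is a simplified illustration of the lookup only, not the actual transformers implementation (which handles lazy imports and a much larger registry):

```python
# Simplified sketch of model_type -> processor-class dispatch (illustrative only;
# the real transformers registry is far larger and resolves classes lazily).
PROCESSOR_MAPPING = {
    "clip": "CLIPProcessor",
    "whisper": "WhisperProcessor",
    "wav2vec2": "Wav2Vec2Processor",
}


def resolve_processor(model_type: str) -> str:
    """Return the processor class name registered for a given model_type."""
    try:
        return PROCESSOR_MAPPING[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")


print(resolve_processor("whisper"))  # WhisperProcessor
```

In the real library, the `model_type` key comes from the config object, which is either passed in directly or loaded from `pretrained_model_name_or_path`.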

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on huggingface.co. - a path to a *directory* containing processor files saved using the `save_pretrained()` method, e.g., `./my_model_directory/`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final feature extractor object. If `True`, then this function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.

#### register[[transformers.AutoProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/processing_auto.py#L430)

Register a new processor for this class.

**Parameters:**

config_class ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The configuration corresponding to the model to register.

processor_class ([ProcessorMixin](/docs/transformers/v4.57.1/ko/main_classes/processors#transformers.ProcessorMixin)) : The processor to register.
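The config-to-processor pairing that `register` maintains can be sketched in plain Python. This is a hypothetical, simplified registry, assuming stand-in `MyConfig`/`MyProcessor` classes rather than real `PretrainedConfig`/`ProcessorMixin` subclasses:

```python
# Simplified sketch of how register() pairs a config class with a processor
# class (illustrative only; the real method also integrates with AutoProcessor
# and guards against overwriting built-in mappings).
class ProcessorRegistry:
    def __init__(self):
        self._mapping = {}

    def register(self, config_class, processor_class, exist_ok=False):
        """Associate a config class with a processor class."""
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = processor_class

    def lookup(self, config):
        """Resolve the processor class for a config instance."""
        return self._mapping[type(config)]


class MyConfig:  # stands in for a PretrainedConfig subclass
    model_type = "my-model"


class MyProcessor:  # stands in for a ProcessorMixin subclass
    pass


registry = ProcessorRegistry()
registry.register(MyConfig, MyProcessor)
print(registry.lookup(MyConfig()).__name__)  # MyProcessor
```

After registering, resolving a processor for any instance of the registered config class returns the custom processor class.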

## 일반적인 모델 클래스[[generic-model-classes]]

다음 자동 클래스들은 특정 헤드 없이 기본 모델 클래스를 인스턴스화하는 데 사용할 수 있습니다.

### AutoModel[[transformers.AutoModel]][[transformers.AutoModel]]

#### transformers.AutoModel[[transformers.AutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1940)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
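The pattern of blocking `__init__()` while dispatching through class methods can be sketched as follows. This is a minimal, hypothetical illustration of the Auto-class idea with made-up `BertLikeConfig`/`BertLikeModel` classes, not the actual factory code:

```python
# Simplified sketch of the Auto-class pattern: direct construction raises,
# and from_config() dispatches on the configuration's class (illustrative only).
class BertLikeConfig:  # stands in for a PretrainedConfig subclass
    pass


class BertLikeModel:  # stands in for the matching base model class
    def __init__(self, config):
        self.config = config


_MODEL_MAPPING = {BertLikeConfig: BertLikeModel}


class AutoModelSketch:
    def __init__(self):
        raise EnvironmentError(
            "AutoModelSketch is designed to be instantiated using "
            "from_config(config) or from_pretrained(name_or_path)."
        )

    @classmethod
    def from_config(cls, config):
        """Instantiate the model class registered for this config class."""
        model_class = _MODEL_MAPPING[type(config)]
        return model_class(config)


model = AutoModelSketch.from_config(BertLikeConfig())
print(type(model).__name__)  # BertLikeModel
```

Note that `from_config` only builds the architecture from the configuration; it does not load pretrained weights, which is what `from_pretrained` adds on top.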

#### from_config[[transformers.AutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ASTConfig` configuration class: `ASTModel` (Audio Spectrogram Transformer model)
  - `Aimv2Config` configuration class: `Aimv2Model` (AIMv2 model)
  - `Aimv2VisionConfig` configuration class: `Aimv2VisionModel` (Aimv2VisionModel model)
  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertModel` (ALBERT model)
  - `AlignConfig` configuration class: `AlignModel` (ALIGN model)
  - [AltCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
  - `ApertusConfig` configuration class: `ApertusModel` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeModel` (Arcee model)
  - `AriaConfig` configuration class: `AriaModel` (Aria model)
  - `AriaTextConfig` configuration class: `AriaTextModel` (AriaText model)
  - [AutoformerConfig](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model)
  - `AyaVisionConfig` configuration class: `AyaVisionModel` (AyaVision model)
  - `BambaConfig` configuration class: `BambaModel` (Bamba model)
  - `BarkConfig` configuration class: `BarkModel` (Bark model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartModel) (BART model)
  - `BeitConfig` configuration class: `BeitModel` (BEiT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertModel) (BERT model)
  - `BertGenerationConfig` configuration class: `BertGenerationEncoder` (Bert Generation model)
  - `BigBirdConfig` configuration class: `BigBirdModel` (BigBird model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusModel` (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptModel) (BioGpt model)
  - `BitConfig` configuration class: `BitModel` (BiT model)
  - `BitNetConfig` configuration class: `BitNetModel` (BitNet model)
  - `BlenderbotConfig` configuration class: `BlenderbotModel` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallModel` (BlenderbotSmall model)
  - [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model)
  - [Blip2QFormerConfig](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model)
  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model)
  - `BloomConfig` configuration class: `BloomModel` (BLOOM model)
  - `BltConfig` configuration class: `BltModel` (Blt model)
  - `BridgeTowerConfig` configuration class: `BridgeTowerModel` (BridgeTower model)
  - `BrosConfig` configuration class: `BrosModel` (BROS model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model)
  - [CLIPSegConfig](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
  - [CLIPTextConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model)
  - [CLIPVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model)
  - `CTRLConfig` configuration class: `CTRLModel` (CTRL model)
  - `CamembertConfig` configuration class: `CamembertModel` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineModel` (CANINE model)
  - [ChameleonConfig](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonModel](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonModel) (Chameleon model)
  - `ChineseCLIPConfig` configuration class: `ChineseCLIPModel` (Chinese-CLIP model)
  - `ChineseCLIPVisionConfig` configuration class: `ChineseCLIPVisionModel` (ChineseCLIPVisionModel model)
  - `ClapConfig` configuration class: `ClapModel` (CLAP model)
  - `ClvpConfig` configuration class: `ClvpModelForConditionalGeneration` (CLVP model)
  - [CodeGenConfig](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenModel) (CodeGen model)
  - `Cohere2Config` configuration class: `Cohere2Model` (Cohere2 model)
  - `Cohere2VisionConfig` configuration class: `Cohere2VisionModel` (Cohere2Vision model)
  - [CohereConfig](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereModel](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereModel) (Cohere model)
  - `ConditionalDetrConfig` configuration class: `ConditionalDetrModel` (Conditional DETR model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model)
  - `ConvNextConfig` configuration class: `ConvNextModel` (ConvNeXT model)
  - `ConvNextV2Config` configuration class: `ConvNextV2Model` (ConvNeXTV2 model)
  - `CpmAntConfig` configuration class: `CpmAntModel` (CPM-Ant model)
  - `CsmConfig` configuration class: `CsmForConditionalGeneration` (CSM model)
  - `CvtConfig` configuration class: `CvtModel` (CvT model)
  - `DFineConfig` configuration class: `DFineModel` (D-FINE model)
  - `DINOv3ConvNextConfig` configuration class: `DINOv3ConvNextModel` (DINOv3 ConvNext model)
  - `DINOv3ViTConfig` configuration class: `DINOv3ViTModel` (DINOv3 ViT model)
  - `DPRConfig` configuration class: `DPRQuestionEncoder` (DPR model)
  - `DPTConfig` configuration class: `DPTModel` (DPT model)
  - `DabDetrConfig` configuration class: `DabDetrModel` (DAB-DETR model)
  - `DacConfig` configuration class: `DacModel` (DAC model)
  - `Data2VecAudioConfig` configuration class: `Data2VecAudioModel` (Data2VecAudio model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextModel` (Data2VecText model)
  - `Data2VecVisionConfig` configuration class: `Data2VecVisionModel` (Data2VecVision model)
  - [DbrxConfig](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxModel](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxModel) (DBRX model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaModel) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model)
  - `DecisionTransformerConfig` configuration class: `DecisionTransformerModel` (Decision Transformer model)
  - `DeepseekV2Config` configuration class: `DeepseekV2Model` (DeepSeek-V2 model)
  - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3Model](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepSeek-V3 model)
  - `DeepseekVLConfig` configuration class: `DeepseekVLModel` (DeepseekVL model)
  - `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridModel` (DeepseekVLHybrid model)
  - `DeformableDetrConfig` configuration class: `DeformableDetrModel` (Deformable DETR model)
  - `DeiTConfig` configuration class: `DeiTModel` (DeiT model)
  - `DepthProConfig` configuration class: `DepthProModel` (DepthPro model)
  - `DetaConfig` configuration class: `DetaModel` (DETA model)
  - `DetrConfig` configuration class: `DetrModel` (DETR model)
  - `DiaConfig` configuration class: `DiaModel` (Dia model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaModel` (DiffLlama model)
  - `DinatConfig` configuration class: `DinatModel` (DiNAT model)
  - `Dinov2Config` configuration class: `Dinov2Model` (DINOv2 model)
  - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersModel` (DINOv2 with Registers model)
  - `DistilBertConfig` configuration class: `DistilBertModel` (DistilBERT model)
  - `DogeConfig` configuration class: `DogeModel` (Doge model)
  - `DonutSwinConfig` configuration class: `DonutSwinModel` (DonutSwin model)
  - `Dots1Config` configuration class: `Dots1Model` (dots1 model)
  - `EdgeTamConfig` configuration class: `EdgeTamModel` (EdgeTAM model)
  - `EdgeTamVideoConfig` configuration class: `EdgeTamVideoModel` (EdgeTamVideo model)
  - `EdgeTamVisionConfig` configuration class: `EdgeTamVisionModel` (EdgeTamVisionModel model)
  - `EfficientFormerConfig` configuration class: `EfficientFormerModel` (EfficientFormer model)
  - `EfficientLoFTRConfig` configuration class: `EfficientLoFTRModel` (EfficientLoFTR model)
  - `EfficientNetConfig` configuration class: `EfficientNetModel` (EfficientNet model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraModel) (ELECTRA model)
  - `Emu3Config` configuration class: `Emu3Model` (Emu3 model)
  - `EncodecConfig` configuration class: `EncodecModel` (EnCodec model)
  - `Ernie4_5Config` configuration class: `Ernie4_5Model` (Ernie4_5 model)
  - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeModel` (Ernie4_5_MoE model)
  - `ErnieConfig` configuration class: `ErnieModel` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMModel` (ErnieM model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [EsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmModel) (ESM model)
  - `EvollaConfig` configuration class: `EvollaModel` (Evolla model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4Model](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Model) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetModel` (FNet model)
  - `FSMTConfig` configuration class: `FSMTModel` (FairSeq Machine-Translation model)
  - `FalconConfig` configuration class: `FalconModel` (Falcon model)
  - `FalconH1Config` configuration class: `FalconH1Model` (FalconH1 model)
  - `FalconMambaConfig` configuration class: `FalconMambaModel` (FalconMamba model)
  - `FastSpeech2ConformerConfig` configuration class: `FastSpeech2ConformerModel` (FastSpeech2Conformer model)
  - `FastSpeech2ConformerWithHifiGanConfig` configuration class: `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model)
  - `FlaubertConfig` configuration class: `FlaubertModel` (FlauBERT model)
  - `FlavaConfig` configuration class: `FlavaModel` (FLAVA model)
  - `FlexOlmoConfig` configuration class: `FlexOlmoModel` (FlexOlmo model)
  - `Florence2Config` configuration class: `Florence2Model` (Florence2 model)
  - `FocalNetConfig` configuration class: `FocalNetModel` (FocalNet model)
  - `FunnelConfig` configuration class: `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model)
  - `FuyuConfig` configuration class: `FuyuModel` (Fuyu model)
  - `GLPNConfig` configuration class: `GLPNModel` (GLPN model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Model) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeModel` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJModel` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoModel` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXModel` (GPT NeoX model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseModel](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPT NeoX Japanese model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2Model](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Model) (Gemma2 model)
  - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3Model](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Model) (Gemma3ForConditionalGeneration model)
  - [Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextModel](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3ForCausalLM model)
  - `Gemma3nAudioConfig` configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model)
  - `Gemma3nConfig` configuration class: `Gemma3nModel` (Gemma3nForConditionalGeneration model)
  - `Gemma3nTextConfig` configuration class: `Gemma3nTextModel` (Gemma3nForCausalLM model)
  - `Gemma3nVisionConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaModel) (Gemma model)
  - `GitConfig` configuration class: `GitModel` (GIT model)
  - `Glm4Config` configuration class: `Glm4Model` (GLM4 model)
  - `Glm4MoeConfig` configuration class: `Glm4MoeModel` (Glm4MoE model)
  - `Glm4vConfig` configuration class: `Glm4vModel` (GLM4V model)
  - `Glm4vMoeConfig` configuration class: `Glm4vMoeModel` (GLM4VMOE model)
  - `Glm4vMoeTextConfig` configuration class: `Glm4vMoeTextModel` (GLM4VMOE model)
  - `Glm4vTextConfig` configuration class: `Glm4vTextModel` (GLM4V model)
  - `GlmConfig` configuration class: `GlmModel` (GLM model)
  - `GotOcr2Config` configuration class: `GotOcr2Model` (GOT-OCR2 model)
  - `GptOssConfig` configuration class: `GptOssModel` (GptOss model)
  - `GraniteConfig` configuration class: `GraniteModel` (Granite model)
  - `GraniteMoeConfig` configuration class: `GraniteMoeModel` (GraniteMoeMoe model)
  - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridModel` (GraniteMoeHybrid model)
  - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedModel` (GraniteMoeSharedMoe model)
  - [GraphormerConfig](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerConfig) configuration class: [GraphormerModel](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerModel) (Graphormer model)
  - [GroundingDinoConfig](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoModel](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoModel) (Grounding DINO model)
  - `GroupViTConfig` configuration class: `GroupViTModel` (GroupViT model)
  - `HGNetV2Config` configuration class: `HGNetV2Backbone` (HGNet-V2 model)
  - `HeliumConfig` configuration class: `HeliumModel` (Helium model)
  - `HieraConfig` configuration class: `HieraModel` (Hiera model)
  - `HubertConfig` configuration class: `HubertModel` (Hubert model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1Model` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1Model` (HunYuanMoeV1 model)
  - `IBertConfig` configuration class: `IBertModel` (I-BERT model)
  - `IJepaConfig` configuration class: `IJepaModel` (I-JEPA model)
  - `Idefics2Config` configuration class: `Idefics2Model` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3Model` (Idefics3 model)
  - `Idefics3VisionConfig` configuration class: `Idefics3VisionTransformer` (Idefics3VisionTransformer model)
  - `IdeficsConfig` configuration class: `IdeficsModel` (IDEFICS model)
  - `ImageGPTConfig` configuration class: `ImageGPTModel` (ImageGPT model)
  - [InformerConfig](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerConfig) configuration class: [InformerModel](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerModel) (Informer model)
  - `InstructBlipConfig` configuration class: `InstructBlipModel` (InstructBLIP model)
  - `InstructBlipVideoConfig` configuration class: `InstructBlipVideoModel` (InstructBlipVideo model)
  - `InternVLConfig` configuration class: `InternVLModel` (InternVL model)
  - `InternVLVisionConfig` configuration class: `InternVLVisionModel` (InternVLVision model)
  - [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaModel](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaModel) (Jamba model)
  - `JanusConfig` configuration class: `JanusModel` (Janus model)
  - `JetMoeConfig` configuration class: `JetMoeModel` (JetMoe model)
  - `JukeboxConfig` configuration class: `JukeboxModel` (Jukebox model)
  - `Kosmos2Config` configuration class: `Kosmos2Model` (KOSMOS-2 model)
  - `Kosmos2_5Config` configuration class: `Kosmos2_5Model` (KOSMOS-2.5 model)
  - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextModel` (KyutaiSpeechToText model)
  - `LEDConfig` configuration class: `LEDModel` (LED model)
  - `LayoutLMConfig` configuration class: `LayoutLMModel` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2Model` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3Model` (LayoutLMv3 model)
  - `LevitConfig` configuration class: `LevitModel` (LeViT model)
  - `Lfm2Config` configuration class: `Lfm2Model` (Lfm2 model)
  - `Lfm2VlConfig` configuration class: `Lfm2VlModel` (Lfm2Vl model)
  - `LightGlueConfig` configuration class: `LightGlueForKeypointMatching` (LightGlue model)
  - `LiltConfig` configuration class: `LiltModel` (LiLT model)
  - `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model)
  - `Llama4TextConfig` configuration class: `Llama4TextModel` (Llama4ForCausalLM model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaModel](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaModel) (LLaMA model)
  - `LlavaConfig` configuration class: `LlavaModel` (LLaVa model)
  - `LlavaNextConfig` configuration class: `LlavaNextModel` (LLaVA-NeXT model)
  - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoModel` (LLaVa-NeXT-Video model)
  - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionModel` (LLaVA-Onevision model)
  - `LongT5Config` configuration class: `LongT5Model` (LongT5 model)
  - `LongcatFlashConfig` configuration class: `LongcatFlashModel` (LongCatFlash model)
  - `LongformerConfig` configuration class: `LongformerModel` (Longformer model)
  - `LukeConfig` configuration class: `LukeModel` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertModel` (LXMERT model)
  - `M2M100Config` configuration class: `M2M100Model` (M2M100 model)
  - `MBartConfig` configuration class: `MBartModel` (mBART model)
  - `MCTCTConfig` configuration class: `MCTCTModel` (M-CTC-T model)
  - `MLCDVisionConfig` configuration class: `MLCDVisionModel` (MLCD model)
  - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoModel` (MM Grounding DINO model)
  - `MPNetConfig` configuration class: `MPNetModel` (MPNet model)
  - `MT5Config` configuration class: `MT5Model` (MT5 model)
  - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2Model](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Model) (mamba2 model)
  - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaModel](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaModel) (Mamba model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianModel) (Marian model)
  - `MarkupLMConfig` configuration class: `MarkupLMModel` (MarkupLM model)
  - `Mask2FormerConfig` configuration class: `Mask2FormerModel` (Mask2Former model)
  - `MaskFormerConfig` configuration class: `MaskFormerModel` (MaskFormer model)
  - `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwin model)
  - `MegaConfig` configuration class: `MegaModel` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertModel` (Megatron-BERT model)
  - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model)
  - `MgpstrConfig` configuration class: `MgpstrForSceneTextRecognition` (MGP-STR model)
  - `MimiConfig` configuration class: `MimiModel` (Mimi model)
  - `MiniMaxConfig` configuration class: `MiniMaxModel` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralModel` (Ministral model)
  - `Mistral3Config` configuration class: `Mistral3Model` (Mistral3 model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralModel) (Mistral model)
  - `MixtralConfig` configuration class: `MixtralModel` (Mixtral model)
  - `MllamaConfig` configuration class: `MllamaModel` (Mllama model)
  - `MobileBertConfig` configuration class: `MobileBertModel` (MobileBERT model)
  - `MobileNetV1Config` configuration class: `MobileNetV1Model` (MobileNetV1 model)
  - `MobileNetV2Config` configuration class: `MobileNetV2Model` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTModel` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2Model` (MobileViTV2 model)
  - `ModernBertConfig` configuration class: `ModernBertModel` (ModernBERT model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderModel` (ModernBertDecoder model)
  - `MoonshineConfig` configuration class: `MoonshineModel` (Moonshine model)
  - `MoshiConfig` configuration class: `MoshiModel` (Moshi model)
  - `MptConfig` configuration class: `MptModel` (MPT model)
  - `MraConfig` configuration class: `MraModel` (MRA model)
  - `MusicgenConfig` configuration class: `MusicgenModel` (MusicGen model)
  - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyModel` (MusicGen Melody model)
  - `MvpConfig` configuration class: `MvpModel` (MVP model)
  - `NatConfig` configuration class: `NatModel` (NAT model)
  - `NemotronConfig` configuration class: `NemotronModel` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaModel` (Nezha model)
  - `NllbMoeConfig` configuration class: `NllbMoeModel` (NLLB-MOE model)
  - `NystromformerConfig` configuration class: `NystromformerModel` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTModel` (OPT model)
  - `Olmo2Config` configuration class: `Olmo2Model` (OLMo2 model)
  - `Olmo3Config` configuration class: `Olmo3Model` (Olmo3 model)
  - `OlmoConfig` configuration class: `OlmoModel` (OLMo model)
  - `OlmoeConfig` configuration class: `OlmoeModel` (OLMoE model)
  - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model)
  - `OneFormerConfig` configuration class: `OneFormerModel` (OneFormer model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaModel` (OpenLlama model)
  - `Ovis2Config` configuration class: `Ovis2Model` (Ovis2 model)
  - `OwlViTConfig` configuration class: `OwlViTModel` (OWL-ViT model)
  - `Owlv2Config` configuration class: `Owlv2Model` (OWLv2 model)
  - `PLBartConfig` configuration class: `PLBartModel` (PLBart model)
  - [PaliGemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: `PaliGemmaModel` (PaliGemma model)
  - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model)
  - `ParakeetEncoderConfig` configuration class: `ParakeetEncoder` (ParakeetEncoder model)
  - [PatchTSMixerConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) configuration class: [PatchTSMixerModel](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixer model)
  - [PatchTSTConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTConfig) configuration class: [PatchTSTModel](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTModel) (PatchTST model)
  - `PegasusConfig` configuration class: `PegasusModel` (Pegasus model)
  - `PegasusXConfig` configuration class: `PegasusXModel` (PEGASUS-X model)
  - `PerceiverConfig` configuration class: `PerceiverModel` (Perceiver model)
  - `PerceptionLMConfig` configuration class: `PerceptionLMModel` (PerceptionLM model)
  - `PersimmonConfig` configuration class: `PersimmonModel` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3Model` (Phi3 model)
  - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalModel` (Phi4Multimodal model)
  - `PhiConfig` configuration class: `PhiModel` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeModel` (Phimoe model)
  - `PixtralVisionConfig` configuration class: `PixtralVisionModel` (Pixtral model)
  - `PoolFormerConfig` configuration class: `PoolFormerModel` (PoolFormer model)
  - `ProphetNetConfig` configuration class: `ProphetNetModel` (ProphetNet model)
  - `PvtConfig` configuration class: `PvtModel` (PVT model)
  - `PvtV2Config` configuration class: `PvtV2Model` (PVTv2 model)
  - `QDQBertConfig` configuration class: `QDQBertModel` (QDQBert model)
  - `Qwen2AudioEncoderConfig` configuration class: `Qwen2AudioEncoder` (Qwen2AudioEncoder model)
  - `Qwen2Config` configuration class: `Qwen2Model` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeModel` (Qwen2MoE model)
  - [Qwen2VLConfig](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLModel](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VL model)
  - `Qwen2VLTextConfig` configuration class: `Qwen2VLTextModel` (Qwen2VL model)
  - `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLModel` (Qwen2_5_VL model)
  - `Qwen2_5_VLTextConfig` configuration class: `Qwen2_5_VLTextModel` (Qwen2_5_VL model)
  - `Qwen3Config` configuration class: `Qwen3Model` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeModel` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextModel` (Qwen3Next model)
  - `Qwen3VLConfig` configuration class: `Qwen3VLModel` (Qwen3VL model)
  - `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeModel` (Qwen3VLMoe model)
  - `Qwen3VLMoeTextConfig` configuration class: `Qwen3VLMoeTextModel` (Qwen3VLMoe model)
  - `Qwen3VLTextConfig` configuration class: `Qwen3VLTextModel` (Qwen3VL model)
  - `RTDetrConfig` configuration class: `RTDetrModel` (RT-DETR model)
  - `RTDetrV2Config` configuration class: `RTDetrV2Model` (RT-DETRv2 model)
  - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaModel` (RecurrentGemma model)
  - `ReformerConfig` configuration class: `ReformerModel` (Reformer model)
  - `RegNetConfig` configuration class: `RegNetModel` (RegNet model)
  - `RemBertConfig` configuration class: `RemBertModel` (RemBERT model)
  - `ResNetConfig` configuration class: `ResNetModel` (ResNet model)
  - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model)
  - `RoCBertConfig` configuration class: `RoCBertModel` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerModel` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaModel) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvModel` (RWKV model)
  - `SEWConfig` configuration class: `SEWModel` (SEW model)
  - `SEWDConfig` configuration class: `SEWDModel` (SEW-D model)
  - `Sam2Config` configuration class: `Sam2Model` (SAM2 model)
  - `Sam2HieraDetConfig` configuration class: `Sam2HieraDetModel` (Sam2HieraDetModel model)
  - `Sam2VideoConfig` configuration class: `Sam2VideoModel` (Sam2VideoModel model)
  - `Sam2VisionConfig` configuration class: `Sam2VisionModel` (Sam2VisionModel model)
  - `SamConfig` configuration class: `SamModel` (SAM model)
  - `SamHQConfig` configuration class: `SamHQModel` (SAM-HQ model)
  - `SamHQVisionConfig` configuration class: `SamHQVisionModel` (SamHQVisionModel model)
  - `SamVisionConfig` configuration class: `SamVisionModel` (SamVisionModel model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TModel` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2Model` (SeamlessM4Tv2 model)
  - `SeedOssConfig` configuration class: `SeedOssModel` (SeedOss model)
  - `SegGptConfig` configuration class: `SegGptModel` (SegGPT model)
  - `SegformerConfig` configuration class: `SegformerModel` (SegFormer model)
  - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model)
  - `Siglip2VisionConfig` configuration class: `Siglip2VisionModel` (Siglip2VisionModel model)
  - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model)
  - [SiglipVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionConfig) configuration class: [SiglipVisionModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionModel model)
  - `SmolLM3Config` configuration class: `SmolLM3Model` (SmolLM3 model)
  - [SmolVLMConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMModel](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLM model)
  - [SmolVLMVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionConfig) configuration class: [SmolVLMVisionTransformer](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionTransformer model)
  - `Speech2TextConfig` configuration class: `Speech2TextModel` (Speech2Text model)
  - `SpeechT5Config` configuration class: `SpeechT5Model` (SpeechT5 model)
  - `SplinterConfig` configuration class: `SplinterModel` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertModel` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmModel` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2Model` (Starcoder2 model)
  - `SwiftFormerConfig` configuration class: `SwiftFormerModel` (SwiftFormer model)
  - [Swin2SRConfig](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRConfig) configuration class: [Swin2SRModel](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SR model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinModel) (Swin Transformer model)
  - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2Model](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Model) (Swin Transformer V2 model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersModel` (SwitchTransformers model)
  - `T5Config` configuration class: `T5Model` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaModel` (T5Gemma model)
  - `TableTransformerConfig` configuration class: `TableTransformerModel` (Table Transformer model)
  - `TapasConfig` configuration class: `TapasModel` (TAPAS model)
  - `TextNetConfig` configuration class: `TextNetModel` (TextNet model)
  - [TimeSeriesTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) configuration class: [TimeSeriesTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (Time Series Transformer model)
  - `TimesFmConfig` configuration class: `TimesFmModel` (TimesFm model)
  - [TimesformerConfig](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerModel](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerModel) (TimeSformer model)
  - `TimmBackboneConfig` configuration class: `TimmBackbone` (TimmBackbone model)
  - `TimmWrapperConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
  - [TrajectoryTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerConfig) configuration class: [TrajectoryTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerModel) (Trajectory Transformer model)
  - `TransfoXLConfig` configuration class: `TransfoXLModel` (Transformer-XL model)
  - `TvltConfig` configuration class: `TvltModel` (TVLT model)
  - [TvpConfig](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpConfig) configuration class: [TvpModel](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpModel) (TVP model)
  - `UMT5Config` configuration class: `UMT5Model` (UMT5 model)
  - `UdopConfig` configuration class: `UdopModel` (UDOP model)
  - `UniSpeechConfig` configuration class: `UniSpeechModel` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatModel` (UniSpeechSat model)
  - `UnivNetConfig` configuration class: `UnivNetModel` (UnivNet model)
  - `VJEPA2Config` configuration class: `VJEPA2Model` (VJEPA2Model model)
  - `VanConfig` configuration class: `VanModel` (VAN model)
  - `VaultGemmaConfig` configuration class: `VaultGemmaModel` (VaultGemma model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTModel) (ViT model)
  - `ViTHybridConfig` configuration class: `ViTHybridModel` (ViT Hybrid model)
  - `ViTMAEConfig` configuration class: `ViTMAEModel` (ViTMAE model)
  - `ViTMSNConfig` configuration class: `ViTMSNModel` (ViTMSN model)
  - `VideoLlavaConfig` configuration class: `VideoLlavaModel` (VideoLlava model)
  - `VideoMAEConfig` configuration class: `VideoMAEModel` (VideoMAE model)
  - `ViltConfig` configuration class: `ViltModel` (ViLT model)
  - `VipLlavaConfig` configuration class: `VipLlavaModel` (VipLlava model)
  - `VisionTextDualEncoderConfig` configuration class: `VisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `VisualBertConfig` configuration class: `VisualBertModel` (VisualBERT model)
  - `VitDetConfig` configuration class: `VitDetModel` (VitDet model)
  - `VitsConfig` configuration class: `VitsModel` (VITS model)
  - [VivitConfig](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitModel](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitModel) (ViViT model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `VoxtralEncoderConfig` configuration class: `VoxtralEncoder` (Voxtral Encoder model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertModel` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2Model` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMModel` (WavLM model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperModel) (Whisper model)
  - [XCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPConfig) configuration class: [XCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPModel) (X-CLIP model)
  - `XGLMConfig` configuration class: `XGLMModel` (XGLM model)
  - `XLMConfig` configuration class: `XLMModel` (XLM model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetModel` (XLM-ProphetNet model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaModel` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLModel` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetModel` (XLNet model)
  - `XcodecConfig` configuration class: `XcodecModel` (X-CODEC model)
  - `XmodConfig` configuration class: `XmodModel` (X-MOD model)
  - `YolosConfig` configuration class: `YolosModel` (YOLOS model)
  - `YosoConfig` configuration class: `YosoModel` (YOSO model)
  - `Zamba2Config` configuration class: `Zamba2Model` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaModel` (Zamba model)
  - `xLSTMConfig` configuration class: `xLSTMModel` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
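  As a minimal sketch of this parameter (the tiny `BertConfig` values below are arbitrary illustration values, and `_attn_implementation` is a private attribute that may change between versions):

  ```python
  from transformers import AutoModel, BertConfig

  # Deliberately tiny, randomly initialized config so instantiation is cheap
  # and nothing is downloaded.
  config = BertConfig(
      hidden_size=32,
      num_hidden_layers=1,
      num_attention_heads=2,
      intermediate_size=64,
      vocab_size=128,
  )

  # Request the manual attention implementation explicitly.
  model = AutoModel.from_config(config, attn_implementation="eager")
  print(model.config._attn_implementation)
  ```

  Requesting `"flash_attention_2"` on a machine without the `flash-attn` package raises an error, so `"eager"` or `"sdpa"` are the portable choices for an example like this.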

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)
```
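The same dispatch works for every entry in the configuration mapping above; a small offline sketch (the tiny config values are arbitrary) confirming which concrete class comes back:

```python
from transformers import AutoModel, BertConfig, BertModel

# Small randomly initialized configuration; from_config downloads nothing
# and loads no pretrained weights.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=128,
)
model = AutoModel.from_config(config)

# AutoModel dispatched to the concrete class registered for BertConfig.
assert isinstance(model, BertModel)
```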

**Parameters:**

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:
  - `ASTConfig` configuration class: `ASTModel` (Audio Spectrogram Transformer model)
  - `Aimv2Config` configuration class: `Aimv2Model` (AIMv2 model)
  - `Aimv2VisionConfig` configuration class: `Aimv2VisionModel` (Aimv2VisionModel model)
  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertModel` (ALBERT model)
  - `AlignConfig` configuration class: `AlignModel` (ALIGN model)
  - [AltCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
  - `ApertusConfig` configuration class: `ApertusModel` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeModel` (Arcee model)
  - `AriaConfig` configuration class: `AriaModel` (Aria model)
  - `AriaTextConfig` configuration class: `AriaTextModel` (AriaText model)
  - [AutoformerConfig](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model)
  - `AyaVisionConfig` configuration class: `AyaVisionModel` (AyaVision model)
  - `BambaConfig` configuration class: `BambaModel` (Bamba model)
  - `BarkConfig` configuration class: `BarkModel` (Bark model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartModel) (BART model)
  - `BeitConfig` configuration class: `BeitModel` (BEiT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertModel) (BERT model)
  - `BertGenerationConfig` configuration class: `BertGenerationEncoder` (Bert Generation model)
  - `BigBirdConfig` configuration class: `BigBirdModel` (BigBird model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusModel` (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptModel) (BioGpt model)
  - `BitConfig` configuration class: `BitModel` (BiT model)
  - `BitNetConfig` configuration class: `BitNetModel` (BitNet model)
  - `BlenderbotConfig` configuration class: `BlenderbotModel` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallModel` (BlenderbotSmall model)
  - [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model)
  - [Blip2QFormerConfig](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model)
  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model)
  - `BloomConfig` configuration class: `BloomModel` (BLOOM model)
  - `BltConfig` configuration class: `BltModel` (Blt model)
  - `BridgeTowerConfig` configuration class: `BridgeTowerModel` (BridgeTower model)
  - `BrosConfig` configuration class: `BrosModel` (BROS model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model)
  - [CLIPSegConfig](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
  - [CLIPTextConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model)
  - [CLIPVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model)
  - `CTRLConfig` configuration class: `CTRLModel` (CTRL model)
  - `CamembertConfig` configuration class: `CamembertModel` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineModel` (CANINE model)
  - [ChameleonConfig](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonModel](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonModel) (Chameleon model)
  - `ChineseCLIPConfig` configuration class: `ChineseCLIPModel` (Chinese-CLIP model)
  - `ChineseCLIPVisionConfig` configuration class: `ChineseCLIPVisionModel` (ChineseCLIPVisionModel model)
  - `ClapConfig` configuration class: `ClapModel` (CLAP model)
  - `ClvpConfig` configuration class: `ClvpModelForConditionalGeneration` (CLVP model)
  - [CodeGenConfig](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenModel) (CodeGen model)
  - `Cohere2Config` configuration class: `Cohere2Model` (Cohere2 model)
  - `Cohere2VisionConfig` configuration class: `Cohere2VisionModel` (Cohere2Vision model)
  - [CohereConfig](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereModel](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereModel) (Cohere model)
  - `ConditionalDetrConfig` configuration class: `ConditionalDetrModel` (Conditional DETR model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model)
  - `ConvNextConfig` configuration class: `ConvNextModel` (ConvNeXT model)
  - `ConvNextV2Config` configuration class: `ConvNextV2Model` (ConvNeXTV2 model)
  - `CpmAntConfig` configuration class: `CpmAntModel` (CPM-Ant model)
  - `CsmConfig` configuration class: `CsmForConditionalGeneration` (CSM model)
  - `CvtConfig` configuration class: `CvtModel` (CvT model)
  - `DFineConfig` configuration class: `DFineModel` (D-FINE model)
  - `DINOv3ConvNextConfig` configuration class: `DINOv3ConvNextModel` (DINOv3 ConvNext model)
  - `DINOv3ViTConfig` configuration class: `DINOv3ViTModel` (DINOv3 ViT model)
  - `DPRConfig` configuration class: `DPRQuestionEncoder` (DPR model)
  - `DPTConfig` configuration class: `DPTModel` (DPT model)
  - `DabDetrConfig` configuration class: `DabDetrModel` (DAB-DETR model)
  - `DacConfig` configuration class: `DacModel` (DAC model)
  - `Data2VecAudioConfig` configuration class: `Data2VecAudioModel` (Data2VecAudio model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextModel` (Data2VecText model)
  - `Data2VecVisionConfig` configuration class: `Data2VecVisionModel` (Data2VecVision model)
  - [DbrxConfig](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxModel](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxModel) (DBRX model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaModel) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model)
  - `DecisionTransformerConfig` configuration class: `DecisionTransformerModel` (Decision Transformer model)
  - `DeepseekV2Config` configuration class: `DeepseekV2Model` (DeepSeek-V2 model)
  - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3Model](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepSeek-V3 model)
  - `DeepseekVLConfig` configuration class: `DeepseekVLModel` (DeepseekVL model)
  - `DeepseekVLHybridConfig` configuration class: `DeepseekVLHybridModel` (DeepseekVLHybrid model)
  - `DeformableDetrConfig` configuration class: `DeformableDetrModel` (Deformable DETR model)
  - `DeiTConfig` configuration class: `DeiTModel` (DeiT model)
  - `DepthProConfig` configuration class: `DepthProModel` (DepthPro model)
  - `DetaConfig` configuration class: `DetaModel` (DETA model)
  - `DetrConfig` configuration class: `DetrModel` (DETR model)
  - `DiaConfig` configuration class: `DiaModel` (Dia model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaModel` (DiffLlama model)
  - `DinatConfig` configuration class: `DinatModel` (DiNAT model)
  - `Dinov2Config` configuration class: `Dinov2Model` (DINOv2 model)
  - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersModel` (DINOv2 with Registers model)
  - `DistilBertConfig` configuration class: `DistilBertModel` (DistilBERT model)
  - `DogeConfig` configuration class: `DogeModel` (Doge model)
  - `DonutSwinConfig` configuration class: `DonutSwinModel` (DonutSwin model)
  - `Dots1Config` configuration class: `Dots1Model` (dots1 model)
  - `EdgeTamConfig` configuration class: `EdgeTamModel` (EdgeTAM model)
  - `EdgeTamVideoConfig` configuration class: `EdgeTamVideoModel` (EdgeTamVideo model)
  - `EdgeTamVisionConfig` configuration class: `EdgeTamVisionModel` (EdgeTamVisionModel model)
  - `EfficientFormerConfig` configuration class: `EfficientFormerModel` (EfficientFormer model)
  - `EfficientLoFTRConfig` configuration class: `EfficientLoFTRModel` (EfficientLoFTR model)
  - `EfficientNetConfig` configuration class: `EfficientNetModel` (EfficientNet model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraModel) (ELECTRA model)
  - `Emu3Config` configuration class: `Emu3Model` (Emu3 model)
  - `EncodecConfig` configuration class: `EncodecModel` (EnCodec model)
  - `Ernie4_5Config` configuration class: `Ernie4_5Model` (Ernie4_5 model)
  - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeModel` (Ernie4_5_MoE model)
  - `ErnieConfig` configuration class: `ErnieModel` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMModel` (ErnieM model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [EsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmModel) (ESM model)
  - `EvollaConfig` configuration class: `EvollaModel` (Evolla model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4Model](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Model) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetModel` (FNet model)
  - `FSMTConfig` configuration class: `FSMTModel` (FairSeq Machine-Translation model)
  - `FalconConfig` configuration class: `FalconModel` (Falcon model)
  - `FalconH1Config` configuration class: `FalconH1Model` (FalconH1 model)
  - `FalconMambaConfig` configuration class: `FalconMambaModel` (FalconMamba model)
  - `FastSpeech2ConformerConfig` configuration class: `FastSpeech2ConformerModel` (FastSpeech2Conformer model)
  - `FastSpeech2ConformerWithHifiGanConfig` configuration class: `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model)
  - `FlaubertConfig` configuration class: `FlaubertModel` (FlauBERT model)
  - `FlavaConfig` configuration class: `FlavaModel` (FLAVA model)
  - `FlexOlmoConfig` configuration class: `FlexOlmoModel` (FlexOlmo model)
  - `Florence2Config` configuration class: `Florence2Model` (Florence2 model)
  - `FocalNetConfig` configuration class: `FocalNetModel` (FocalNet model)
  - `FunnelConfig` configuration class: `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model)
  - `FuyuConfig` configuration class: `FuyuModel` (Fuyu model)
  - `GLPNConfig` configuration class: `GLPNModel` (GLPN model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Model) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeModel` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJModel` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoModel` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXModel` (GPT NeoX model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseModel](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPT NeoX Japanese model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2Model](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Model) (Gemma2 model)
  - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3Model](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Model) (Gemma3ForConditionalGeneration model)
  - [Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextModel](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3ForCausalLM model)
  - `Gemma3nAudioConfig` configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model)
  - `Gemma3nConfig` configuration class: `Gemma3nModel` (Gemma3nForConditionalGeneration model)
  - `Gemma3nTextConfig` configuration class: `Gemma3nTextModel` (Gemma3nForCausalLM model)
  - `Gemma3nVisionConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaModel) (Gemma model)
  - `GitConfig` configuration class: `GitModel` (GIT model)
  - `Glm4Config` configuration class: `Glm4Model` (GLM4 model)
  - `Glm4MoeConfig` configuration class: `Glm4MoeModel` (Glm4MoE model)
  - `Glm4vConfig` configuration class: `Glm4vModel` (GLM4V model)
  - `Glm4vMoeConfig` configuration class: `Glm4vMoeModel` (GLM4VMOE model)
  - `Glm4vMoeTextConfig` configuration class: `Glm4vMoeTextModel` (GLM4VMOE model)
  - `Glm4vTextConfig` configuration class: `Glm4vTextModel` (GLM4V model)
  - `GlmConfig` configuration class: `GlmModel` (GLM model)
  - `GotOcr2Config` configuration class: `GotOcr2Model` (GOT-OCR2 model)
  - `GptOssConfig` configuration class: `GptOssModel` (GptOss model)
  - `GraniteConfig` configuration class: `GraniteModel` (Granite model)
  - `GraniteMoeConfig` configuration class: `GraniteMoeModel` (GraniteMoeMoe model)
  - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridModel` (GraniteMoeHybrid model)
  - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedModel` (GraniteMoeSharedMoe model)
  - [GraphormerConfig](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerConfig) configuration class: [GraphormerModel](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerModel) (Graphormer model)
  - [GroundingDinoConfig](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoModel](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoModel) (Grounding DINO model)
  - `GroupViTConfig` configuration class: `GroupViTModel` (GroupViT model)
  - `HGNetV2Config` configuration class: `HGNetV2Backbone` (HGNet-V2 model)
  - `HeliumConfig` configuration class: `HeliumModel` (Helium model)
  - `HieraConfig` configuration class: `HieraModel` (Hiera model)
  - `HubertConfig` configuration class: `HubertModel` (Hubert model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1Model` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1Model` (HunYuanMoeV1 model)
  - `IBertConfig` configuration class: `IBertModel` (I-BERT model)
  - `IJepaConfig` configuration class: `IJepaModel` (I-JEPA model)
  - `Idefics2Config` configuration class: `Idefics2Model` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3Model` (Idefics3 model)
  - `Idefics3VisionConfig` configuration class: `Idefics3VisionTransformer` (Idefics3VisionTransformer model)
  - `IdeficsConfig` configuration class: `IdeficsModel` (IDEFICS model)
  - `ImageGPTConfig` configuration class: `ImageGPTModel` (ImageGPT model)
  - [InformerConfig](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerConfig) configuration class: [InformerModel](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerModel) (Informer model)
  - `InstructBlipConfig` configuration class: `InstructBlipModel` (InstructBLIP model)
  - `InstructBlipVideoConfig` configuration class: `InstructBlipVideoModel` (InstructBlipVideo model)
  - `InternVLConfig` configuration
class: `InternVLModel` (InternVL model) - `InternVLVisionConfig` configuration class: `InternVLVisionModel` (InternVLVision model) - [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaModel](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaModel) (Jamba model) - `JanusConfig` configuration class: `JanusModel` (Janus model) - `JetMoeConfig` configuration class: `JetMoeModel` (JetMoe model) - `JukeboxConfig` configuration class: `JukeboxModel` (Jukebox model) - `Kosmos2Config` configuration class: `Kosmos2Model` (KOSMOS-2 model) - `Kosmos2_5Config` configuration class: `Kosmos2_5Model` (KOSMOS-2.5 model) - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextModel` (KyutaiSpeechToText model) - `LEDConfig` configuration class: `LEDModel` (LED model) - `LayoutLMConfig` configuration class: `LayoutLMModel` (LayoutLM model) - `LayoutLMv2Config` configuration class: `LayoutLMv2Model` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3Model` (LayoutLMv3 model) - `LevitConfig` configuration class: `LevitModel` (LeViT model) - `Lfm2Config` configuration class: `Lfm2Model` (Lfm2 model) - `Lfm2VlConfig` configuration class: `Lfm2VlModel` (Lfm2Vl model) - `LightGlueConfig` configuration class: `LightGlueForKeypointMatching` (LightGlue model) - `LiltConfig` configuration class: `LiltModel` (LiLT model) - `Llama4Config` configuration class: `Llama4ForConditionalGeneration` (Llama4 model) - `Llama4TextConfig` configuration class: `Llama4TextModel` (Llama4ForCausalLM model) - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaModel](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaModel) (LLaMA model) - `LlavaConfig` configuration class: `LlavaModel` (LLaVa model) - `LlavaNextConfig` configuration class: `LlavaNextModel` (LLaVA-NeXT model) - `LlavaNextVideoConfig` configuration class: 
`LlavaNextVideoModel` (LLaVa-NeXT-Video model) - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionModel` (LLaVA-Onevision model) - `LongT5Config` configuration class: `LongT5Model` (LongT5 model) - `LongcatFlashConfig` configuration class: `LongcatFlashModel` (LongCatFlash model) - `LongformerConfig` configuration class: `LongformerModel` (Longformer model) - `LukeConfig` configuration class: `LukeModel` (LUKE model) - `LxmertConfig` configuration class: `LxmertModel` (LXMERT model) - `M2M100Config` configuration class: `M2M100Model` (M2M100 model) - `MBartConfig` configuration class: `MBartModel` (mBART model) - `MCTCTConfig` configuration class: `MCTCTModel` (M-CTC-T model) - `MLCDVisionConfig` configuration class: `MLCDVisionModel` (MLCD model) - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoModel` (MM Grounding DINO model) - `MPNetConfig` configuration class: `MPNetModel` (MPNet model) - `MT5Config` configuration class: `MT5Model` (MT5 model) - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2Model](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Model) (mamba2 model) - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaModel](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaModel) (Mamba model) - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianModel) (Marian model) - `MarkupLMConfig` configuration class: `MarkupLMModel` (MarkupLM model) - `Mask2FormerConfig` configuration class: `Mask2FormerModel` (Mask2Former model) - `MaskFormerConfig` configuration class: `MaskFormerModel` (MaskFormer model) - `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwin model) - `MegaConfig` configuration class: `MegaModel` (MEGA 
model) - `MegatronBertConfig` configuration class: `MegatronBertModel` (Megatron-BERT model) - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model) - `MgpstrConfig` configuration class: `MgpstrForSceneTextRecognition` (MGP-STR model) - `MimiConfig` configuration class: `MimiModel` (Mimi model) - `MiniMaxConfig` configuration class: `MiniMaxModel` (MiniMax model) - `MinistralConfig` configuration class: `MinistralModel` (Ministral model) - `Mistral3Config` configuration class: `Mistral3Model` (Mistral3 model) - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralModel) (Mistral model) - `MixtralConfig` configuration class: `MixtralModel` (Mixtral model) - `MllamaConfig` configuration class: `MllamaModel` (Mllama model) - `MobileBertConfig` configuration class: `MobileBertModel` (MobileBERT model) - `MobileNetV1Config` configuration class: `MobileNetV1Model` (MobileNetV1 model) - `MobileNetV2Config` configuration class: `MobileNetV2Model` (MobileNetV2 model) - `MobileViTConfig` configuration class: `MobileViTModel` (MobileViT model) - `MobileViTV2Config` configuration class: `MobileViTV2Model` (MobileViTV2 model) - `ModernBertConfig` configuration class: `ModernBertModel` (ModernBERT model) - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderModel` (ModernBertDecoder model) - `MoonshineConfig` configuration class: `MoonshineModel` (Moonshine model) - `MoshiConfig` configuration class: `MoshiModel` (Moshi model) - `MptConfig` configuration class: `MptModel` (MPT model) - `MraConfig` configuration class: `MraModel` (MRA model) - `MusicgenConfig` configuration class: `MusicgenModel` (MusicGen model) - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyModel` (MusicGen Melody model) - `MvpConfig` configuration class: `MvpModel` (MVP model) - `NatConfig` configuration class: `NatModel` 
(NAT model) - `NemotronConfig` configuration class: `NemotronModel` (Nemotron model) - `NezhaConfig` configuration class: `NezhaModel` (Nezha model) - `NllbMoeConfig` configuration class: `NllbMoeModel` (NLLB-MOE model) - `NystromformerConfig` configuration class: `NystromformerModel` (Nyströmformer model) - `OPTConfig` configuration class: `OPTModel` (OPT model) - `Olmo2Config` configuration class: `Olmo2Model` (OLMo2 model) - `Olmo3Config` configuration class: `Olmo3Model` (Olmo3 model) - `OlmoConfig` configuration class: `OlmoModel` (OLMo model) - `OlmoeConfig` configuration class: `OlmoeModel` (OLMoE model) - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model) - `OneFormerConfig` configuration class: `OneFormerModel` (OneFormer model) - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAI GPT model) - `OpenLlamaConfig` configuration class: `OpenLlamaModel` (OpenLlama model) - `Ovis2Config` configuration class: `Ovis2Model` (Ovis2 model) - `OwlViTConfig` configuration class: `OwlViTModel` (OWL-ViT model) - `Owlv2Config` configuration class: `Owlv2Model` (OWLv2 model) - `PLBartConfig` configuration class: `PLBartModel` (PLBart model) - [PaliGemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: `PaliGemmaModel` (PaliGemma model) - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model) - `ParakeetEncoderConfig` configuration class: `ParakeetEncoder` (ParakeetEncoder model) - [PatchTSMixerConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) configuration class: [PatchTSMixerModel](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixer model) - 
[PatchTSTConfig](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTConfig) configuration class: [PatchTSTModel](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTModel) (PatchTST model) - `PegasusConfig` configuration class: `PegasusModel` (Pegasus model) - `PegasusXConfig` configuration class: `PegasusXModel` (PEGASUS-X model) - `PerceiverConfig` configuration class: `PerceiverModel` (Perceiver model) - `PerceptionLMConfig` configuration class: `PerceptionLMModel` (PerceptionLM model) - `PersimmonConfig` configuration class: `PersimmonModel` (Persimmon model) - `Phi3Config` configuration class: `Phi3Model` (Phi3 model) - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalModel` (Phi4Multimodal model) - `PhiConfig` configuration class: `PhiModel` (Phi model) - `PhimoeConfig` configuration class: `PhimoeModel` (Phimoe model) - `PixtralVisionConfig` configuration class: `PixtralVisionModel` (Pixtral model) - `PoolFormerConfig` configuration class: `PoolFormerModel` (PoolFormer model) - `ProphetNetConfig` configuration class: `ProphetNetModel` (ProphetNet model) - `PvtConfig` configuration class: `PvtModel` (PVT model) - `PvtV2Config` configuration class: `PvtV2Model` (PVTv2 model) - `QDQBertConfig` configuration class: `QDQBertModel` (QDQBert model) - `Qwen2AudioEncoderConfig` configuration class: `Qwen2AudioEncoder` (Qwen2AudioEncoder model) - `Qwen2Config` configuration class: `Qwen2Model` (Qwen2 model) - `Qwen2MoeConfig` configuration class: `Qwen2MoeModel` (Qwen2MoE model) - [Qwen2VLConfig](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLModel](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VL model) - `Qwen2VLTextConfig` configuration class: `Qwen2VLTextModel` (Qwen2VL model) - `Qwen2_5_VLConfig` configuration class: `Qwen2_5_VLModel` (Qwen2_5_VL model) - `Qwen2_5_VLTextConfig` configuration class: `Qwen2_5_VLTextModel` 
(Qwen2_5_VL model) - `Qwen3Config` configuration class: `Qwen3Model` (Qwen3 model) - `Qwen3MoeConfig` configuration class: `Qwen3MoeModel` (Qwen3MoE model) - `Qwen3NextConfig` configuration class: `Qwen3NextModel` (Qwen3Next model) - `Qwen3VLConfig` configuration class: `Qwen3VLModel` (Qwen3VL model) - `Qwen3VLMoeConfig` configuration class: `Qwen3VLMoeModel` (Qwen3VLMoe model) - `Qwen3VLMoeTextConfig` configuration class: `Qwen3VLMoeTextModel` (Qwen3VLMoe model) - `Qwen3VLTextConfig` configuration class: `Qwen3VLTextModel` (Qwen3VL model) - `RTDetrConfig` configuration class: `RTDetrModel` (RT-DETR model) - `RTDetrV2Config` configuration class: `RTDetrV2Model` (RT-DETRv2 model) - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaModel` (RecurrentGemma model) - `ReformerConfig` configuration class: `ReformerModel` (Reformer model) - `RegNetConfig` configuration class: `RegNetModel` (RegNet model) - `RemBertConfig` configuration class: `RemBertModel` (RemBERT model) - `ResNetConfig` configuration class: `ResNetModel` (ResNet model) - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model) - `RoCBertConfig` configuration class: `RoCBertModel` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerModel` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaModel) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model) - `RwkvConfig` configuration class: `RwkvModel` (RWKV model) - `SEWConfig` configuration class: `SEWModel` (SEW model) - `SEWDConfig` configuration class: `SEWDModel` (SEW-D model) - `Sam2Config` configuration class: `Sam2Model` (SAM2 model) - `Sam2HieraDetConfig` configuration class: `Sam2HieraDetModel` (Sam2HieraDetModel model) - `Sam2VideoConfig` configuration class: `Sam2VideoModel` 
(Sam2VideoModel model) - `Sam2VisionConfig` configuration class: `Sam2VisionModel` (Sam2VisionModel model) - `SamConfig` configuration class: `SamModel` (SAM model) - `SamHQConfig` configuration class: `SamHQModel` (SAM-HQ model) - `SamHQVisionConfig` configuration class: `SamHQVisionModel` (SamHQVisionModel model) - `SamVisionConfig` configuration class: `SamVisionModel` (SamVisionModel model) - `SeamlessM4TConfig` configuration class: `SeamlessM4TModel` (SeamlessM4T model) - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2Model` (SeamlessM4Tv2 model) - `SeedOssConfig` configuration class: `SeedOssModel` (SeedOss model) - `SegGptConfig` configuration class: `SegGptModel` (SegGPT model) - `SegformerConfig` configuration class: `SegformerModel` (SegFormer model) - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model) - `Siglip2VisionConfig` configuration class: `Siglip2VisionModel` (Siglip2VisionModel model) - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model) - [SiglipVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionConfig) configuration class: [SiglipVisionModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionModel model) - `SmolLM3Config` configuration class: `SmolLM3Model` (SmolLM3 model) - [SmolVLMConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMModel](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLM model) - [SmolVLMVisionConfig](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionConfig) configuration class: [SmolVLMVisionTransformer](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionTransformer model) - `Speech2TextConfig` 
configuration class: `Speech2TextModel` (Speech2Text model) - `SpeechT5Config` configuration class: `SpeechT5Model` (SpeechT5 model) - `SplinterConfig` configuration class: `SplinterModel` (Splinter model) - `SqueezeBertConfig` configuration class: `SqueezeBertModel` (SqueezeBERT model) - `StableLmConfig` configuration class: `StableLmModel` (StableLm model) - `Starcoder2Config` configuration class: `Starcoder2Model` (Starcoder2 model) - `SwiftFormerConfig` configuration class: `SwiftFormerModel` (SwiftFormer model) - [Swin2SRConfig](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRConfig) configuration class: [Swin2SRModel](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SR model) - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinModel) (Swin Transformer model) - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2Model](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Model) (Swin Transformer V2 model) - `SwitchTransformersConfig` configuration class: `SwitchTransformersModel` (SwitchTransformers model) - `T5Config` configuration class: `T5Model` (T5 model) - `T5GemmaConfig` configuration class: `T5GemmaModel` (T5Gemma model) - `TableTransformerConfig` configuration class: `TableTransformerModel` (Table Transformer model) - `TapasConfig` configuration class: `TapasModel` (TAPAS model) - `TextNetConfig` configuration class: `TextNetModel` (TextNet model) - [TimeSeriesTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) configuration class: [TimeSeriesTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (Time Series Transformer model) - `TimesFmConfig` configuration 
class: `TimesFmModel` (TimesFm model) - [TimesformerConfig](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerModel](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerModel) (TimeSformer model) - `TimmBackboneConfig` configuration class: `TimmBackbone` (TimmBackbone model) - `TimmWrapperConfig` configuration class: `TimmWrapperModel` (TimmWrapperModel model) - [TrajectoryTransformerConfig](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerConfig) configuration class: [TrajectoryTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerModel) (Trajectory Transformer model) - `TransfoXLConfig` configuration class: `TransfoXLModel` (Transformer-XL model) - `TvltConfig` configuration class: `TvltModel` (TVLT model) - [TvpConfig](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpConfig) configuration class: [TvpModel](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpModel) (TVP model) - `UMT5Config` configuration class: `UMT5Model` (UMT5 model) - `UdopConfig` configuration class: `UdopModel` (UDOP model) - `UniSpeechConfig` configuration class: `UniSpeechModel` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatModel` (UniSpeechSat model) - `UnivNetConfig` configuration class: `UnivNetModel` (UnivNet model) - `VJEPA2Config` configuration class: `VJEPA2Model` (VJEPA2Model model) - `VanConfig` configuration class: `VanModel` (VAN model) - `VaultGemmaConfig` configuration class: `VaultGemmaModel` (VaultGemma model) - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTModel) (ViT model) - `ViTHybridConfig` configuration class: `ViTHybridModel` (ViT Hybrid model) - `ViTMAEConfig` configuration class: `ViTMAEModel` 
(ViTMAE model) - `ViTMSNConfig` configuration class: `ViTMSNModel` (ViTMSN model) - `VideoLlavaConfig` configuration class: `VideoLlavaModel` (VideoLlava model) - `VideoMAEConfig` configuration class: `VideoMAEModel` (VideoMAE model) - `ViltConfig` configuration class: `ViltModel` (ViLT model) - `VipLlavaConfig` configuration class: `VipLlavaModel` (VipLlava model) - `VisionTextDualEncoderConfig` configuration class: `VisionTextDualEncoderModel` (VisionTextDualEncoder model) - `VisualBertConfig` configuration class: `VisualBertModel` (VisualBERT model) - `VitDetConfig` configuration class: `VitDetModel` (VitDet model) - `VitsConfig` configuration class: `VitsModel` (VITS model) - [VivitConfig](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitModel](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitModel) (ViViT model) - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model) - `VoxtralEncoderConfig` configuration class: `VoxtralEncoder` (Voxtral Encoder model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertModel` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2Model` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMModel` (WavLM model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperModel) (Whisper model) - [XCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPConfig) configuration class: [XCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPModel) (X-CLIP model) - `XGLMConfig` configuration class: `XGLMModel` (XGLM model) - `XLMConfig` configuration class: `XLMModel` (XLM model) - `XLMProphetNetConfig` configuration class: 
`XLMProphetNetModel` (XLM-ProphetNet model) - `XLMRobertaConfig` configuration class: `XLMRobertaModel` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLModel` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetModel` (XLNet model) - `XcodecConfig` configuration class: `XcodecModel` (X-CODEC model) - `XmodConfig` configuration class: `XmodModel` (X-MOD model) - `YolosConfig` configuration class: `YolosModel` (YOLOS model) - `YosoConfig` configuration class: `YosoModel` (YOSO model) - `Zamba2Config` configuration class: `Zamba2Model` (Zamba2 model) - `ZambaConfig` configuration class: `ZambaModel` (Zamba model) - `xLSTMConfig` configuration class: `xLSTMModel` (xLSTM model)
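The mapping above is the dispatch table consulted by `AutoModel.from_config`: the configuration class alone determines which model class is instantiated, with randomly initialized weights (no checkpoint is downloaded). A minimal sketch using one of the listed pairs, assuming `transformers` and `torch` are installed; the tiny config values are arbitrary and chosen only to keep the model small:

```python
from transformers import AutoModel, ElectraConfig

# An `ElectraConfig` dispatches to `ElectraModel` (see the mapping above).
# The dimensions here are deliberately tiny and purely illustrative.
config = ElectraConfig(
    num_hidden_layers=2,
    hidden_size=64,
    num_attention_heads=4,
    intermediate_size=128,
)
model = AutoModel.from_config(config)
print(type(model).__name__)  # ElectraModel
```

To load pretrained weights instead of random ones, use `AutoModel.from_pretrained` with a checkpoint name, as described below.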

attn_implementation (`str`, *optional*): The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- `Aimv2Model` (AIMv2 model)
- **aimv2_vision_model** -- `Aimv2VisionModel` (Aimv2VisionModel model)
- **albert** -- `AlbertModel` (ALBERT model)
- **align** -- `AlignModel` (ALIGN model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
- **apertus** -- `ApertusModel` (Apertus model)
- **arcee** -- `ArceeModel` (Arcee model)
- **aria** -- `AriaModel` (Aria model)
- **aria_text** -- `AriaTextModel` (AriaText model)
- **audio-spectrogram-transformer** -- `ASTModel` (Audio Spectrogram Transformer model)
- **autoformer** -- [AutoformerModel](/docs/transformers/v4.57.1/ko/model_doc/autoformer#transformers.AutoformerModel) (Autoformer model)
- **aya_vision** -- `AyaVisionModel` (AyaVision model)
- **bamba** -- `BambaModel` (Bamba model)
- **bark** -- `BarkModel` (Bark model)
- **bart** -- [BartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartModel) (BART model)
- **beit** -- `BeitModel` (BEiT model)
- **bert** -- [BertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertModel) (BERT model)
- **bert-generation** -- `BertGenerationEncoder` (Bert Generation model)
- **big_bird** -- `BigBirdModel` (BigBird model)
- **bigbird_pegasus** -- `BigBirdPegasusModel` (BigBird-Pegasus model)
- **biogpt** -- [BioGptModel](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptModel) (BioGpt model)
- **bit** -- `BitModel` (BiT model)
- **bitnet** -- `BitNetModel` (BitNet model)
- **blenderbot** -- `BlenderbotModel` (Blenderbot model)
- **blenderbot-small** -- `BlenderbotSmallModel` (BlenderbotSmall model)
- **blip** -- [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model)
- **blip-2** -- [Blip2Model](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Model) (BLIP-2 model)
- **blip_2_qformer** -- [Blip2QFormerModel](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2QFormerModel) (BLIP-2 QFormer model)
- **bloom** -- `BloomModel` (BLOOM model)
- **blt** -- `BltModel` (Blt model)
- **bridgetower** -- `BridgeTowerModel` (BridgeTower model)
- **bros** -- `BrosModel` (BROS model)
- **camembert** -- `CamembertModel` (CamemBERT model)
- **canine** -- `CanineModel` (CANINE model)
- **chameleon** -- [ChameleonModel](/docs/transformers/v4.57.1/ko/model_doc/chameleon#transformers.ChameleonModel) (Chameleon model)
- **chinese_clip** -- `ChineseCLIPModel` (Chinese-CLIP model)
- **chinese_clip_vision_model** -- `ChineseCLIPVisionModel` (ChineseCLIPVisionModel model)
- **clap** -- `ClapModel` (CLAP model)
- **clip** -- [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model)
- **clip_text_model** -- [CLIPTextModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPTextModel) (CLIPTextModel model)
- **clip_vision_model** -- [CLIPVisionModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionModel model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
- **clvp** -- `ClvpModelForConditionalGeneration` (CLVP model)
- **code_llama** -- [LlamaModel](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaModel) (CodeLlama model)
- **codegen** -- [CodeGenModel](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenModel) (CodeGen model)
- **cohere** -- [CohereModel](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereModel) (Cohere model)
- **cohere2** -- `Cohere2Model` (Cohere2 model)
- **cohere2_vision** -- `Cohere2VisionModel` (Cohere2Vision model)
- **conditional_detr** -- `ConditionalDetrModel` (Conditional DETR model)
- **convbert** -- [ConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertModel) (ConvBERT model)
- **convnext** -- `ConvNextModel` (ConvNeXT model)
- **convnextv2** -- `ConvNextV2Model` (ConvNeXTV2 model)
- **cpmant** -- `CpmAntModel` (CPM-Ant model)
- **csm** -- `CsmForConditionalGeneration` (CSM model)
- **ctrl** -- `CTRLModel` (CTRL model)
- **cvt** -- `CvtModel` (CvT model)
- **d_fine** -- `DFineModel` (D-FINE model)
- **dab-detr** -- `DabDetrModel` (DAB-DETR model)
- **dac** -- `DacModel` (DAC model)
- **data2vec-audio** -- `Data2VecAudioModel` (Data2VecAudio model)
- **data2vec-text** -- `Data2VecTextModel` (Data2VecText model)
- **data2vec-vision** -- `Data2VecVisionModel` (Data2VecVision model)
- **dbrx** -- [DbrxModel](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxModel) (DBRX model)
- **deberta** -- [DebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaModel) (DeBERTa model)
- **deberta-v2** -- [DebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Model) (DeBERTa-v2 model)
- **decision_transformer** -- `DecisionTransformerModel` (Decision Transformer model)
- **deepseek_v2** -- `DeepseekV2Model` (DeepSeek-V2 model)
- **deepseek_v3** -- [DeepseekV3Model](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepSeek-V3 model)
- **deepseek_vl** -- `DeepseekVLModel` (DeepseekVL model)
- **deepseek_vl_hybrid** -- `DeepseekVLHybridModel` (DeepseekVLHybrid model)
- **deformable_detr** -- `DeformableDetrModel` (Deformable DETR model)
- **deit** -- `DeiTModel` (DeiT model)
- **depth_pro** -- `DepthProModel` (DepthPro model)
- **deta** -- `DetaModel` (DETA model)
- **detr** -- `DetrModel` (DETR model)
- **dia** -- `DiaModel` (Dia model)
- **diffllama** -- `DiffLlamaModel` (DiffLlama model)
- **dinat** -- `DinatModel` (DiNAT model)
- **dinov2** -- `Dinov2Model` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersModel` (DINOv2 with Registers model)
- **dinov3_convnext** -- `DINOv3ConvNextModel` (DINOv3 ConvNext model)
- **dinov3_vit** -- `DINOv3ViTModel` (DINOv3 ViT model)
- **distilbert** -- `DistilBertModel` (DistilBERT model)
- **doge** -- `DogeModel` (Doge model)
- **donut-swin** -- `DonutSwinModel` (DonutSwin model)
- **dots1** -- `Dots1Model` (dots1 model)
- **dpr** -- `DPRQuestionEncoder` (DPR model)
- **dpt** -- `DPTModel` (DPT model)
- **edgetam** -- `EdgeTamModel` (EdgeTAM model)
- **edgetam_video** -- `EdgeTamVideoModel` (EdgeTamVideo model)
- **edgetam_vision_model** -- `EdgeTamVisionModel` (EdgeTamVisionModel model)
- **efficientformer** -- `EfficientFormerModel` (EfficientFormer model)
- **efficientloftr** -- `EfficientLoFTRModel` (EfficientLoFTR model)
- **efficientnet** -- `EfficientNetModel` (EfficientNet model)
- **electra** -- [ElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraModel) (ELECTRA model)
- **emu3** -- `Emu3Model` (Emu3 model)
- **encodec** -- `EncodecModel` (EnCodec model)
- **ernie** -- `ErnieModel` (ERNIE model)
- **ernie4_5** -- `Ernie4_5Model` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeModel` (Ernie4_5_MoE model)
- **ernie_m** -- `ErnieMModel` (ErnieM model)
- **esm** -- [EsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmModel) (ESM model)
- **evolla** -- `EvollaModel` (Evolla model)
- **exaone4** -- [Exaone4Model](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Model) (EXAONE-4.0 model)
- **falcon** -- `FalconModel` (Falcon model)
- **falcon_h1** -- `FalconH1Model` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaModel` (FalconMamba model)
- **fastspeech2_conformer** -- `FastSpeech2ConformerModel` (FastSpeech2Conformer model)
- **fastspeech2_conformer_with_hifigan** -- `FastSpeech2ConformerWithHifiGan` (FastSpeech2ConformerWithHifiGan model)
- **flaubert** -- `FlaubertModel` (FlauBERT model)
- **flava** -- `FlavaModel` (FLAVA model)
- **flex_olmo** -- `FlexOlmoModel` (FlexOlmo model)
- **florence2** -- `Florence2Model` (Florence2 model)
- **fnet** -- `FNetModel` (FNet model)
- **focalnet** -- `FocalNetModel` (FocalNet model)
- **fsmt** -- `FSMTModel` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelModel` or `FunnelBaseModel` (Funnel Transformer model)
- **fuyu** -- `FuyuModel` (Fuyu model)
- **gemma** -- [GemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaModel) (Gemma model)
- **gemma2** -- [Gemma2Model](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Model) (Gemma2 model)
- **gemma3** -- [Gemma3Model](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Model) (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- [Gemma3TextModel](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nModel` (Gemma3nForConditionalGeneration model)
- **gemma3n_audio** -- `Gemma3nAudioEncoder` (Gemma3nAudioEncoder model)
- **gemma3n_text** -- `Gemma3nTextModel` (Gemma3nForCausalLM model)
- **gemma3n_vision** -- `TimmWrapperModel` (TimmWrapperModel model)
- **git** -- `GitModel` (GIT model)
- **glm** -- `GlmModel` (GLM model)
- **glm4** -- `Glm4Model` (GLM4 model)
- **glm4_moe** -- `Glm4MoeModel` (Glm4MoE model)
- **glm4v** -- `Glm4vModel` (GLM4V model)
- **glm4v_moe** -- `Glm4vMoeModel` (GLM4VMOE model)
- **glm4v_moe_text** -- `Glm4vMoeTextModel` (GLM4VMOE model)
- **glm4v_text** -- `Glm4vTextModel` (GLM4V model)
- **glpn** -- `GLPNModel` (GLPN model)
- **got_ocr2** -- `GotOcr2Model` (GOT-OCR2 model)
- **gpt-sw3** -- [GPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Model) (GPT-Sw3 model)
- **gpt2** -- [GPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Model) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeModel` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoModel` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXModel` (GPT NeoX model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseModel](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssModel` (GptOss model)
- **gptj** -- `GPTJModel` (GPT-J model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **granite** -- `GraniteModel` (Granite model)
- **granitemoe** -- `GraniteMoeModel` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridModel` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedModel` (GraniteMoeSharedMoe model)
- **graphormer** -- [GraphormerModel](/docs/transformers/v4.57.1/ko/model_doc/graphormer#transformers.GraphormerModel) (Graphormer model)
- **grounding-dino** -- [GroundingDinoModel](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoModel) (Grounding DINO model)
- **groupvit** -- `GroupViTModel` (GroupViT model)
- **helium** -- `HeliumModel` (Helium model)
- **hgnet_v2** -- `HGNetV2Backbone` (HGNet-V2 model)
- **hiera** -- `HieraModel` (Hiera model)
- **hubert** -- `HubertModel` (Hubert model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1Model` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1Model` (HunYuanMoeV1 model)
- **ibert** -- `IBertModel` (I-BERT model)
- **idefics** -- `IdeficsModel` (IDEFICS model)
- **idefics2** -- `Idefics2Model` (Idefics2 model)
- **idefics3** -- `Idefics3Model` (Idefics3 model)
- **idefics3_vision** -- `Idefics3VisionTransformer` (Idefics3VisionTransformer model)
- **ijepa** -- `IJepaModel` (I-JEPA model)
- **imagegpt** -- `ImageGPTModel` (ImageGPT model)
- **informer** -- [InformerModel](/docs/transformers/v4.57.1/ko/model_doc/informer#transformers.InformerModel) (Informer model)
- **instructblip** -- `InstructBlipModel` (InstructBLIP model)
- **instructblipvideo** -- `InstructBlipVideoModel` (InstructBlipVideo model)
- **internvl** -- `InternVLModel` (InternVL model)
- **internvl_vision** -- `InternVLVisionModel` (InternVLVision model)
- **jamba** -- [JambaModel](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaModel) (Jamba model)
- **janus** -- `JanusModel` (Janus model)
- **jetmoe** -- `JetMoeModel` (JetMoe model)
- **jukebox** -- `JukeboxModel` (Jukebox model)
- **kosmos-2** -- `Kosmos2Model` (KOSMOS-2 model)
- **kosmos-2.5** -- `Kosmos2_5Model` (KOSMOS-2.5 model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextModel` (KyutaiSpeechToText model)
- **layoutlm** -- `LayoutLMModel` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2Model` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3Model` (LayoutLMv3 model)
- **led** -- `LEDModel` (LED model)
- **levit** -- `LevitModel` (LeViT model)
- **lfm2** -- `Lfm2Model` (Lfm2 model)
- **lfm2_vl** -- `Lfm2VlModel` (Lfm2Vl model)
- **lightglue** -- `LightGlueForKeypointMatching` (LightGlue model)
- **lilt** -- `LiltModel` (LiLT model)
- **llama** -- [LlamaModel](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaModel) (LLaMA model)
- **llama4** -- `Llama4ForConditionalGeneration` (Llama4 model)
- **llama4_text** -- `Llama4TextModel` (Llama4ForCausalLM model)
- **llava** -- `LlavaModel` (LLaVa model)
- **llava_next** -- `LlavaNextModel` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoModel` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionModel` (LLaVA-Onevision model)
- **longcat_flash** -- `LongcatFlashModel` (LongCatFlash model)
- **longformer** -- `LongformerModel` (Longformer model)
- **longt5** -- `LongT5Model` (LongT5 model)
- **luke** -- `LukeModel` (LUKE model)
- **lxmert** -- `LxmertModel` (LXMERT model)
- **m2m_100** -- `M2M100Model` (M2M100 model)
- **mamba** -- [MambaModel](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaModel) (Mamba model)
- **mamba2** -- [Mamba2Model](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Model) (mamba2 model)
- **marian** -- [MarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianModel) (Marian model)
- **markuplm** -- `MarkupLMModel` (MarkupLM model)
- **mask2former** -- `Mask2FormerModel` (Mask2Former model)
- **maskformer** -- `MaskFormerModel` (MaskFormer model)
- **maskformer-swin** -- `MaskFormerSwinModel` (MaskFormerSwin model)
- **mbart** -- `MBartModel` (mBART model)
- **mctct** -- `MCTCTModel` (M-CTC-T model)
- **mega** -- `MegaModel` (MEGA model)
- **megatron-bert** -- `MegatronBertModel` (Megatron-BERT model)
- **metaclip_2** -- `MetaClip2Model` (MetaCLIP 2 model)
- **mgp-str** -- `MgpstrForSceneTextRecognition` (MGP-STR model)
- **mimi** -- `MimiModel` (Mimi model)
- **minimax** -- `MiniMaxModel` (MiniMax model)
- **ministral** -- `MinistralModel` (Ministral model)
- **mistral** -- [MistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralModel) (Mistral model)
- **mistral3** -- `Mistral3Model` (Mistral3 model)
- **mixtral** -- `MixtralModel` (Mixtral model)
- **mlcd** -- `MLCDVisionModel` (MLCD model)
- **mllama** -- `MllamaModel` (Mllama model)
- **mm-grounding-dino** -- `MMGroundingDinoModel` (MM Grounding DINO model)
- **mobilebert** -- `MobileBertModel` (MobileBERT model)
- **mobilenet_v1** -- `MobileNetV1Model` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2Model` (MobileNetV2 model)
- **mobilevit** -- `MobileViTModel` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2Model` (MobileViTV2 model)
- **modernbert** -- `ModernBertModel` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderModel` (ModernBertDecoder model)
- **moonshine** -- `MoonshineModel` (Moonshine model)
- **moshi** -- `MoshiModel` (Moshi model)
- **mpnet** -- `MPNetModel` (MPNet model)
- **mpt** -- `MptModel` (MPT model)
- **mra** -- `MraModel` (MRA model)
- **mt5** -- `MT5Model` (MT5 model)
- **musicgen** -- `MusicgenModel` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyModel` (MusicGen Melody model)
- **mvp** -- `MvpModel` (MVP model)
- **nat** -- `NatModel` (NAT model)
- **nemotron** -- `NemotronModel` (Nemotron model)
- **nezha** -- `NezhaModel` (Nezha model)
- **nllb-moe** -- `NllbMoeModel` (NLLB-MOE model)
- **nystromformer** -- `NystromformerModel` (Nyströmformer model)
- **olmo** -- `OlmoModel` (OLMo model)
- **olmo2** -- `Olmo2Model` (OLMo2 model)
- **olmo3** -- `Olmo3Model` (Olmo3 model)
- **olmoe** -- `OlmoeModel` (OLMoE model)
- **omdet-turbo** -- `OmDetTurboForObjectDetection` (OmDet-Turbo model)
- **oneformer** -- `OneFormerModel` (OneFormer model)
- **open-llama** -- `OpenLlamaModel` (OpenLlama model)
- **openai-gpt** -- [OpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAI GPT model)
- **opt** -- `OPTModel` (OPT model)
- **ovis2** -- `Ovis2Model` (Ovis2 model)
- **owlv2** -- `Owlv2Model` (OWLv2 model)
- **owlvit** -- `OwlViTModel` (OWL-ViT model)
- **paligemma** -- `PaliGemmaModel` (PaliGemma model)
- **parakeet_ctc** -- `ParakeetForCTC` (Parakeet model)
- **parakeet_encoder** -- `ParakeetEncoder` (ParakeetEncoder model)
- **patchtsmixer** -- [PatchTSMixerModel](/docs/transformers/v4.57.1/ko/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixer model)
- **patchtst** -- [PatchTSTModel](/docs/transformers/v4.57.1/ko/model_doc/patchtst#transformers.PatchTSTModel) (PatchTST model)
- **pegasus** -- `PegasusModel` (Pegasus model)
- **pegasus_x** -- `PegasusXModel` (PEGASUS-X model)
- **perceiver** -- `PerceiverModel` (Perceiver model)
- **perception_encoder** -- `PerceptionEncoder` (PerceptionEncoder model)
- **perception_lm** -- `PerceptionLMModel` (PerceptionLM model)
- **persimmon** -- `PersimmonModel` (Persimmon model)
- **phi** -- `PhiModel` (Phi model)
- **phi3** -- `Phi3Model` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalModel` (Phi4Multimodal model)
- **phimoe** -- `PhimoeModel` (Phimoe model)
- **pixtral** -- `PixtralVisionModel` (Pixtral model)
- **plbart** -- `PLBartModel` (PLBart model)
- **poolformer** -- `PoolFormerModel` (PoolFormer model)
- **prophetnet** -- `ProphetNetModel` (ProphetNet model)
- **pvt** -- `PvtModel` (PVT model)
- **pvt_v2** -- `PvtV2Model` (PVTv2 model)
- **qdqbert** -- `QDQBertModel` (QDQBert model)
- **qwen2** -- `Qwen2Model` (Qwen2 model)
- **qwen2_5_vl** -- `Qwen2_5_VLModel` (Qwen2_5_VL model)
- **qwen2_5_vl_text** -- `Qwen2_5_VLTextModel` (Qwen2_5_VL model)
- **qwen2_audio_encoder** -- `Qwen2AudioEncoder` (Qwen2AudioEncoder model)
- **qwen2_moe** -- `Qwen2MoeModel` (Qwen2MoE model)
- **qwen2_vl** -- [Qwen2VLModel](/docs/transformers/v4.57.1/ko/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VL model)
- **qwen2_vl_text** -- `Qwen2VLTextModel` (Qwen2VL model)
- **qwen3** -- `Qwen3Model` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeModel` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextModel` (Qwen3Next model)
- **qwen3_vl** -- `Qwen3VLModel` (Qwen3VL model)
- **qwen3_vl_moe** -- `Qwen3VLMoeModel` (Qwen3VLMoe model)
- **qwen3_vl_moe_text** -- `Qwen3VLMoeTextModel` (Qwen3VLMoe model)
- **qwen3_vl_text** -- `Qwen3VLTextModel` (Qwen3VL model)
- **recurrent_gemma** -- `RecurrentGemmaModel` (RecurrentGemma model)
- **reformer** -- `ReformerModel` (Reformer model)
- **regnet** -- `RegNetModel` (RegNet model)
- **rembert** -- `RemBertModel` (RemBERT model)
- **resnet** -- `ResNetModel` (ResNet model)
- **retribert** -- `RetriBertModel` (RetriBERT model)
- **roberta** -- [RobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaModel) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertModel` (RoCBert model)
- **roformer** -- `RoFormerModel` (RoFormer model)
- **rt_detr** -- `RTDetrModel` (RT-DETR model)
- **rt_detr_v2** -- `RTDetrV2Model` (RT-DETRv2 model)
- **rwkv** -- `RwkvModel` (RWKV model)
- **sam** -- `SamModel` (SAM model)
- **sam2** -- `Sam2Model` (SAM2 model)
- **sam2_hiera_det_model** -- `Sam2HieraDetModel` (Sam2HieraDetModel model)
- **sam2_video** -- `Sam2VideoModel` (Sam2VideoModel model)
- **sam2_vision_model** -- `Sam2VisionModel` (Sam2VisionModel model)
- **sam_hq** -- `SamHQModel` (SAM-HQ model)
- **sam_hq_vision_model** -- `SamHQVisionModel` (SamHQVisionModel model)
- **sam_vision_model** -- `SamVisionModel` (SamVisionModel model)
- **seamless_m4t** -- `SeamlessM4TModel` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2Model` (SeamlessM4Tv2 model)
- **seed_oss** -- `SeedOssModel` (SeedOss model)
- **segformer** -- `SegformerModel` (SegFormer model)
- **seggpt** -- `SegGptModel` (SegGPT model)
- **sew** -- `SEWModel` (SEW model)
- **sew-d** -- `SEWDModel` (SEW-D model)
- **siglip** -- [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model)
- **siglip2** -- `Siglip2Model` (SigLIP2 model)
- **siglip2_vision_model** -- `Siglip2VisionModel` (Siglip2VisionModel model)
- **siglip_vision_model** -- [SiglipVisionModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionModel model)
- **smollm3** -- `SmolLM3Model` (SmolLM3 model)
- **smolvlm** -- [SmolVLMModel](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLM model)
- **smolvlm_vision** -- [SmolVLMVisionTransformer](/docs/transformers/v4.57.1/ko/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionTransformer model)
- **speech_to_text** -- `Speech2TextModel` (Speech2Text model)
- **speecht5** -- `SpeechT5Model` (SpeechT5 model)
- **splinter** -- `SplinterModel` (Splinter model)
- **squeezebert** -- `SqueezeBertModel` (SqueezeBERT model)
- **stablelm** -- `StableLmModel` (StableLm model)
- **starcoder2** -- `Starcoder2Model` (Starcoder2 model)
- **swiftformer** -- `SwiftFormerModel` (SwiftFormer model)
- **swin** -- [SwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinModel) (Swin Transformer model)
- **swin2sr** -- [Swin2SRModel](/docs/transformers/v4.57.1/ko/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SR model)
- **swinv2** -- [Swinv2Model](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Model) (Swin Transformer V2 model)
- **switch_transformers** -- `SwitchTransformersModel` (SwitchTransformers model)
- **t5** -- `T5Model` (T5 model)
- **t5gemma** -- `T5GemmaModel` (T5Gemma model)
- **table-transformer** -- `TableTransformerModel` (Table Transformer model)
- **tapas** -- `TapasModel` (TAPAS model)
- **textnet** -- `TextNetModel` (TextNet model)
- **time_series_transformer** -- [TimeSeriesTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (Time Series Transformer model)
- **timesfm** -- `TimesFmModel` (TimesFm model)
- **timesformer** -- [TimesformerModel](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerModel) (TimeSformer model)
- **timm_backbone** -- `TimmBackbone` (TimmBackbone model)
- **timm_wrapper** -- `TimmWrapperModel` (TimmWrapperModel model)
- **trajectory_transformer** -- [TrajectoryTransformerModel](/docs/transformers/v4.57.1/ko/model_doc/trajectory_transformer#transformers.TrajectoryTransformerModel) (Trajectory Transformer model)
- **transfo-xl** -- `TransfoXLModel` (Transformer-XL model)
- **tvlt** -- `TvltModel` (TVLT model)
- **tvp** -- [TvpModel](/docs/transformers/v4.57.1/ko/model_doc/tvp#transformers.TvpModel) (TVP model)
- **udop** -- `UdopModel` (UDOP model)
- **umt5** -- `UMT5Model` (UMT5 model)
- **unispeech** -- `UniSpeechModel` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatModel` (UniSpeechSat model)
- **univnet** -- `UnivNetModel` (UnivNet model)
- **van** -- `VanModel` (VAN model)
- **vaultgemma** -- `VaultGemmaModel` (VaultGemma model)
- **video_llava** -- `VideoLlavaModel` (VideoLlava model)
- **videomae** -- `VideoMAEModel` (VideoMAE model)
- **vilt** -- `ViltModel` (ViLT model)
- **vipllava** -- `VipLlavaModel` (VipLlava model)
- **vision-text-dual-encoder** -- `VisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **visual_bert** -- `VisualBertModel` (VisualBERT model)
- **vit** -- [ViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTModel) (ViT model)
- **vit_hybrid** -- `ViTHybridModel` (ViT Hybrid model)
- **vit_mae** -- `ViTMAEModel` (ViTMAE model)
- **vit_msn** -- `ViTMSNModel` (ViTMSN model)
- **vitdet** -- `VitDetModel` (VitDet model)
- **vits** -- `VitsModel` (VITS model)
- **vivit** -- [VivitModel](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitModel) (ViViT model)
- **vjepa2** -- `VJEPA2Model` (VJEPA2Model model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **voxtral_encoder** -- `VoxtralEncoder` (Voxtral Encoder model)
- **wav2vec2** -- `Wav2Vec2Model` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertModel` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerModel` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMModel` (WavLM model)
- **whisper** -- [WhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperModel) (Whisper model)
- **xclip** -- [XCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/xclip#transformers.XCLIPModel) (X-CLIP model)
- **xcodec** -- `XcodecModel` (X-CODEC model)
- **xglm** -- `XGLMModel` (XGLM model)
- **xlm** -- `XLMModel` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetModel` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaModel` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLModel` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetModel` (XLNet model)
- **xlstm** -- `xLSTMModel` (xLSTM model)
- **xmod** -- `XmodModel` (X-MOD model)
- **yolos** -- `YolosModel` (YOLOS model)
- **yoso** -- `YosoModel` (YOSO model)
- **zamba** -- `ZambaModel` (Zamba model)
- **zamba2** -- `Zamba2Model` (Zamba2 model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
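The effect of this switch can be seen with a plain PyTorch module — this is a generic sketch (not transformers-specific, and it assumes `torch` is installed) showing that dropout is the identity in evaluation mode:

```python
import torch
import torch.nn as nn

# A toy model with a dropout layer, standing in for a loaded pretrained model.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()  # what from_pretrained() does for you by default
drop = nn.Dropout(p=0.5)
drop.eval()
t = torch.ones(8)
eval_out = drop(t)  # dropout is a no-op in eval mode

model.train()  # switch back before fine-tuning; dropout is active again
```

In evaluation mode `eval_out` equals `t` exactly, while in training mode roughly half of the entries would be zeroed and the rest rescaled.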

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
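The split described above — keys matching configuration attributes override the config, the rest flow to the model — can be sketched in a few lines. This is a hypothetical illustration with made-up `ToyConfig` and `split_kwargs` names, not the actual transformers implementation:

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Stand-in for a PretrainedConfig with a couple of attributes.
    hidden_size: int = 768
    output_attentions: bool = False


def split_kwargs(config, kwargs):
    """Route each kwarg to the config if it names a config attribute,
    otherwise leave it for the model's __init__."""
    config_kwargs, model_kwargs = {}, {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            config_kwargs[key] = value
        else:
            model_kwargs[key] = value
    return config_kwargs, model_kwargs


config = ToyConfig()
config_kwargs, model_kwargs = split_kwargs(
    config, {"output_attentions": True, "custom_model_arg": 1}
)
for key, value in config_kwargs.items():
    setattr(config, key, value)
```

Here `output_attentions` updates the config while `custom_model_arg` would be handed to the model's `__init__`.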

### TFAutoModel[[transformers.TFAutoModel]]

#### transformers.TFAutoModel[[transformers.TFAutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L538)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
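The pattern behind this restriction can be illustrated with a toy factory class — a hypothetical sketch (all names here are made up, not the transformers source) of how direct construction raises while a classmethod dispatches on the config type:

```python
class ToyConfigA:
    """Stand-in for a model-specific configuration class."""


class ToyModelA:
    """Stand-in for the concrete model class mapped to ToyConfigA."""

    def __init__(self, config):
        self.config = config


# Mapping from configuration class to concrete model class.
_MODEL_MAPPING = {ToyConfigA: ToyModelA}


class ToyAutoModel:
    def __init__(self):
        # Mirrors the auto classes: direct instantiation is an error.
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "`ToyAutoModel.from_config(config)`."
        )

    @classmethod
    def from_config(cls, config):
        # Select the concrete class based on the configuration's type.
        return _MODEL_MAPPING[type(config)](config)


model = ToyAutoModel.from_config(ToyConfigA())
```

Calling `ToyAutoModel()` raises, while `from_config` returns a `ToyModelA` instance — the same contract the real auto classes enforce.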

#### transformers.TFAutoModel.from_config[[transformers.TFAutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

**Parameters:**

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertModel` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartModel) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertModel) (BERT model)
  - `BlenderbotConfig` configuration class: `TFBlenderbotModel` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `TFBlenderbotSmallModel` (BlenderbotSmall model)
  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
  - `CTRLConfig` configuration class: `TFCTRLModel` (CTRL model)
  - `CamembertConfig` configuration class: `TFCamembertModel` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
  - `ConvNextConfig` configuration class: `TFConvNextModel` (ConvNeXT model)
  - `ConvNextV2Config` configuration class: `TFConvNextV2Model` (ConvNeXTV2 model)
  - `CvtConfig` configuration class: `TFCvtModel` (CvT model)
  - `DPRConfig` configuration class: `TFDPRQuestionEncoder` (DPR model)
  - `Data2VecVisionConfig` configuration class: `TFData2VecVisionModel` (Data2VecVision model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
  - `DeiTConfig` configuration class: `TFDeiTModel` (DeiT model)
  - `DistilBertConfig` configuration class: `TFDistilBertModel` (DistilBERT model)
  - `EfficientFormerConfig` configuration class: `TFEfficientFormerModel` (EfficientFormer model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraModel) (ELECTRA model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmModel) (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2Model) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJModel` (GPT-J model)
  - `GroupViTConfig` configuration class: `TFGroupViTModel` (GroupViT model)
  - `HubertConfig` configuration class: `TFHubertModel` (Hubert model)
  - `IdeficsConfig` configuration class: `TFIdeficsModel` (IDEFICS model)
  - `LEDConfig` configuration class: `TFLEDModel` (LED model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMModel` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3Model` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerModel` (Longformer model)
  - `LxmertConfig` configuration class: `TFLxmertModel` (LXMERT model)
  - `MBartConfig` configuration class: `TFMBartModel` (mBART model)
  - `MPNetConfig` configuration class: `TFMPNetModel` (MPNet model)
  - `MT5Config` configuration class: `TFMT5Model` (MT5 model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [TFMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianModel) (Marian model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [TFMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralModel) (Mistral model)
  - `MobileBertConfig` configuration class: `TFMobileBertModel` (MobileBERT model)
  - `MobileViTConfig` configuration class: `TFMobileViTModel` (MobileViT model)
  - `OPTConfig` configuration class: `TFOPTModel` (OPT model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTModel) (OpenAI GPT model)
  - `PegasusConfig` configuration class: `TFPegasusModel` (Pegasus model)
  - `RegNetConfig` configuration class: `TFRegNetModel` (RegNet model)
  - `RemBertConfig` configuration class: `TFRemBertModel` (RemBERT model)
  - `ResNetConfig` configuration class: `TFResNetModel` (ResNet model)
  - `RoFormerConfig` configuration class: `TFRoFormerModel` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaModel) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `SamConfig` configuration class: `TFSamModel` (SAM model)
  - `SamVisionConfig` configuration class: `TFSamVisionModel` (SamVisionModel model)
  - `SegformerConfig` configuration class: `TFSegformerModel` (SegFormer model)
  - `Speech2TextConfig` configuration class: `TFSpeech2TextModel` (Speech2Text model)
  - `SwiftFormerConfig` configuration class: `TFSwiftFormerModel` (SwiftFormer model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinModel) (Swin Transformer model)
  - `T5Config` configuration class: `TFT5Model` (T5 model)
  - `TapasConfig` configuration class: `TFTapasModel` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLModel` (Transformer-XL model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [TFViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTModel) (ViT model)
  - `ViTMAEConfig` configuration class: `TFViTMAEModel` (ViTMAE model)
  - `VisionTextDualEncoderConfig` configuration class: `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `Wav2Vec2Config` configuration class: `TFWav2Vec2Model` (Wav2Vec2 model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [TFWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperModel) (Whisper model)
  - `XGLMConfig` configuration class: `TFXGLMModel` (XGLM model)
  - `XLMConfig` configuration class: `TFXLMModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaModel` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_config(config)
```
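Conceptually, each Auto class resolves the configuration class to a concrete model class through a mapping, the same mechanism `register()` extends. Below is a minimal, self-contained sketch of that dispatch with hypothetical names (`MiniAutoModel`, `MiniBertConfig`, etc.), not the library's actual implementation:

```python
# Hypothetical sketch of Auto-class dispatch: a registry maps a
# configuration class to the concrete model class to instantiate.
_MODEL_MAPPING = {}


def register(config_cls, model_cls):
    """Associate a configuration class with a model class."""
    _MODEL_MAPPING[config_cls] = model_cls


class MiniAutoModel:
    @classmethod
    def from_config(cls, config, **kwargs):
        # Look up the model class by the *type* of the config object.
        model_cls = _MODEL_MAPPING[type(config)]
        return model_cls(config, **kwargs)


class MiniBertConfig:
    pass


class MiniBertModel:
    def __init__(self, config, **kwargs):
        self.config = config


register(MiniBertConfig, MiniBertModel)
model = MiniAutoModel.from_config(MiniBertConfig())
print(type(model).__name__)  # MiniBertModel
```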

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertModel` (ALBERT model)
- [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartModel) (BART model)
- [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertModel) (BERT model)
- `BlenderbotConfig` configuration class: `TFBlenderbotModel` (Blenderbot model)
- `BlenderbotSmallConfig` configuration class: `TFBlenderbotSmallModel` (BlenderbotSmall model)
- [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- `CTRLConfig` configuration class: `TFCTRLModel` (CTRL model)
- `CamembertConfig` configuration class: `TFCamembertModel` (CamemBERT model)
- [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
- `ConvNextConfig` configuration class: `TFConvNextModel` (ConvNeXT model)
- `ConvNextV2Config` configuration class: `TFConvNextV2Model` (ConvNeXTV2 model)
- `CvtConfig` configuration class: `TFCvtModel` (CvT model)
- `DPRConfig` configuration class: `TFDPRQuestionEncoder` (DPR model)
- `Data2VecVisionConfig` configuration class: `TFData2VecVisionModel` (Data2VecVision model)
- [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
- [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
- `DeiTConfig` configuration class: `TFDeiTModel` (DeiT model)
- `DistilBertConfig` configuration class: `TFDistilBertModel` (DistilBERT model)
- `EfficientFormerConfig` configuration class: `TFEfficientFormerModel` (EfficientFormer model)
- [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraModel) (ELECTRA model)
- [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmModel) (ESM model)
- `FlaubertConfig` configuration class: `TFFlaubertModel` (FlauBERT model)
- `FunnelConfig` configuration class: `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
- [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2Model) (OpenAI GPT-2 model)
- `GPTJConfig` configuration class: `TFGPTJModel` (GPT-J model)
- `GroupViTConfig` configuration class: `TFGroupViTModel` (GroupViT model)
- `HubertConfig` configuration class: `TFHubertModel` (Hubert model)
- `IdeficsConfig` configuration class: `TFIdeficsModel` (IDEFICS model)
- `LEDConfig` configuration class: `TFLEDModel` (LED model)
- `LayoutLMConfig` configuration class: `TFLayoutLMModel` (LayoutLM model)
- `LayoutLMv3Config` configuration class: `TFLayoutLMv3Model` (LayoutLMv3 model)
- `LongformerConfig` configuration class: `TFLongformerModel` (Longformer model)
- `LxmertConfig` configuration class: `TFLxmertModel` (LXMERT model)
- `MBartConfig` configuration class: `TFMBartModel` (mBART model)
- `MPNetConfig` configuration class: `TFMPNetModel` (MPNet model)
- `MT5Config` configuration class: `TFMT5Model` (MT5 model)
- [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [TFMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianModel) (Marian model)
- [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [TFMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralModel) (Mistral model)
- `MobileBertConfig` configuration class: `TFMobileBertModel` (MobileBERT model)
- `MobileViTConfig` configuration class: `TFMobileViTModel` (MobileViT model)
- `OPTConfig` configuration class: `TFOPTModel` (OPT model)
- [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTModel) (OpenAI GPT model)
- `PegasusConfig` configuration class: `TFPegasusModel` (Pegasus model)
- `RegNetConfig` configuration class: `TFRegNetModel` (RegNet model)
- `RemBertConfig` configuration class: `TFRemBertModel` (RemBERT model)
- `ResNetConfig` configuration class: `TFResNetModel` (ResNet model)
- `RoFormerConfig` configuration class: `TFRoFormerModel` (RoFormer model)
- [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaModel) (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- `SamConfig` configuration class: `TFSamModel` (SAM model)
- `SamVisionConfig` configuration class: `TFSamVisionModel` (SamVisionModel model)
- `SegformerConfig` configuration class: `TFSegformerModel` (SegFormer model)
- `Speech2TextConfig` configuration class: `TFSpeech2TextModel` (Speech2Text model)
- `SwiftFormerConfig` configuration class: `TFSwiftFormerModel` (SwiftFormer model)
- [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinModel) (Swin Transformer model)
- `T5Config` configuration class: `TFT5Model` (T5 model)
- `TapasConfig` configuration class: `TFTapasModel` (TAPAS model)
- `TransfoXLConfig` configuration class: `TFTransfoXLModel` (Transformer-XL model)
- [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [TFViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTModel) (ViT model)
- `ViTMAEConfig` configuration class: `TFViTMAEModel` (ViTMAE model)
- `VisionTextDualEncoderConfig` configuration class: `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- `Wav2Vec2Config` configuration class: `TFWav2Vec2Model` (Wav2Vec2 model)
- [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [TFWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperModel) (Whisper model)
- `XGLMConfig` configuration class: `TFXGLMModel` (XGLM model)
- `XLMConfig` configuration class: `TFXLMModel` (XLM model)
- `XLMRobertaConfig` configuration class: `TFXLMRobertaModel` (XLM-RoBERTa model)
- `XLNetConfig` configuration class: `TFXLNetModel` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertModel` (ALBERT model)
- **bart** -- [TFBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartModel) (BART model)
- **bert** -- [TFBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertModel) (BERT model)
- **blenderbot** -- `TFBlenderbotModel` (Blenderbot model)
- **blenderbot-small** -- `TFBlenderbotSmallModel` (BlenderbotSmall model)
- **blip** -- [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- **camembert** -- `TFCamembertModel` (CamemBERT model)
- **clip** -- [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- **convbert** -- [TFConvBertModel](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertModel) (ConvBERT model)
- **convnext** -- `TFConvNextModel` (ConvNeXT model)
- **convnextv2** -- `TFConvNextV2Model` (ConvNeXTV2 model)
- **ctrl** -- `TFCTRLModel` (CTRL model)
- **cvt** -- `TFCvtModel` (CvT model)
- **data2vec-vision** -- `TFData2VecVisionModel` (Data2VecVision model)
- **deberta** -- [TFDebertaModel](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaModel) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2Model](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2Model) (DeBERTa-v2 model)
- **deit** -- `TFDeiTModel` (DeiT model)
- **distilbert** -- `TFDistilBertModel` (DistilBERT model)
- **dpr** -- `TFDPRQuestionEncoder` (DPR model)
- **efficientformer** -- `TFEfficientFormerModel` (EfficientFormer model)
- **electra** -- [TFElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraModel) (ELECTRA model)
- **esm** -- [TFEsmModel](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmModel) (ESM model)
- **flaubert** -- `TFFlaubertModel` (FlauBERT model)
- **funnel** -- `TFFunnelModel` or `TFFunnelBaseModel` (Funnel Transformer model)
- **gpt-sw3** -- [TFGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2Model) (GPT-Sw3 model)
- **gpt2** -- [TFGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2Model) (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJModel` (GPT-J model)
- **groupvit** -- `TFGroupViTModel` (GroupViT model)
- **hubert** -- `TFHubertModel` (Hubert model)
- **idefics** -- `TFIdeficsModel` (IDEFICS model)
- **layoutlm** -- `TFLayoutLMModel` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3Model` (LayoutLMv3 model)
- **led** -- `TFLEDModel` (LED model)
- **longformer** -- `TFLongformerModel` (Longformer model)
- **lxmert** -- `TFLxmertModel` (LXMERT model)
- **marian** -- [TFMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianModel) (Marian model)
- **mbart** -- `TFMBartModel` (mBART model)
- **mistral** -- [TFMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralModel) (Mistral model)
- **mobilebert** -- `TFMobileBertModel` (MobileBERT model)
- **mobilevit** -- `TFMobileViTModel` (MobileViT model)
- **mpnet** -- `TFMPNetModel` (MPNet model)
- **mt5** -- `TFMT5Model` (MT5 model)
- **openai-gpt** -- [TFOpenAIGPTModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTModel) (OpenAI GPT model)
- **opt** -- `TFOPTModel` (OPT model)
- **pegasus** -- `TFPegasusModel` (Pegasus model)
- **regnet** -- `TFRegNetModel` (RegNet model)
- **rembert** -- `TFRemBertModel` (RemBERT model)
- **resnet** -- `TFResNetModel` (ResNet model)
- **roberta** -- [TFRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaModel) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerModel` (RoFormer model)
- **sam** -- `TFSamModel` (SAM model)
- **sam_vision_model** -- `TFSamVisionModel` (SamVisionModel model)
- **segformer** -- `TFSegformerModel` (SegFormer model)
- **speech_to_text** -- `TFSpeech2TextModel` (Speech2Text model)
- **swiftformer** -- `TFSwiftFormerModel` (SwiftFormer model)
- **swin** -- [TFSwinModel](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinModel) (Swin Transformer model)
- **t5** -- `TFT5Model` (T5 model)
- **tapas** -- `TFTapasModel` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLModel` (Transformer-XL model)
- **vision-text-dual-encoder** -- `TFVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **vit** -- [TFViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTModel) (ViT model)
- **vit_mae** -- `TFViTMAEModel` (ViTMAE model)
- **wav2vec2** -- `TFWav2Vec2Model` (Wav2Vec2 model)
- **whisper** -- [TFWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperModel) (Whisper model)
- **xglm** -- `TFXGLMModel` (XGLM model)
- **xlm** -- `TFXLMModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaModel` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetModel` (XLNet model)
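The pattern-matching fallback can be pictured as trying known keys in order against the model id; because some keys are substrings of others (`bert` appears inside `roberta`, for example), more specific keys must be checked first. A rough, hypothetical sketch of that fallback, not the library's actual code:

```python
# Hypothetical sketch: infer a model type from the model id when the
# config carries no model_type. Keys are ordered most-specific first,
# since e.g. "bert" is a substring of "roberta".
PATTERNS = ["xlm-roberta", "roberta", "bert", "gpt2", "t5"]


def infer_model_type(pretrained_model_name_or_path):
    name = pretrained_model_name_or_path.lower()
    for pattern in PATTERNS:
        if pattern in name:
            return pattern
    raise ValueError(f"Could not infer model type from {name!r}")


print(infer_model_type("google-bert/bert-base-cased"))   # bert
print(infer_model_type("FacebookAI/xlm-roberta-base"))   # xlm-roberta
```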

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
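The two kwargs paths described above can be sketched as a simple routing step (a simplified, hypothetical model of the behavior with an illustrative attribute set, not the actual implementation):

```python
# Hypothetical sketch of kwargs routing in from_pretrained.
CONFIG_ATTRS = {"output_attentions", "output_hidden_states"}  # illustrative subset


def split_kwargs(config=None, **kwargs):
    if config is not None:
        # A config was supplied: every kwarg goes straight to the model __init__.
        return {}, kwargs
    # No config: keys matching config attributes update the config first;
    # the remainder falls through to the model __init__.
    config_updates = {k: v for k, v in kwargs.items() if k in CONFIG_ATTRS}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in CONFIG_ATTRS}
    return config_updates, model_kwargs


updates, model_kwargs = split_kwargs(output_attentions=True, foo=1)
print(updates)       # {'output_attentions': True}
print(model_kwargs)  # {'foo': 1}
```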

### FlaxAutoModel[[transformers.FlaxAutoModel]]

#### transformers.FlaxAutoModel[[transformers.FlaxAutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L281)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertModel` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartModel) (BART model)
  - `BeitConfig` configuration class: `FlaxBeitModel` (BEiT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertModel) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdModel` (BigBird model)
  - `BlenderbotConfig` configuration class: `FlaxBlenderbotModel` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `FlaxBlenderbotSmallModel` (BlenderbotSmall model)
  - `BloomConfig` configuration class: `FlaxBloomModel` (BLOOM model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [FlaxCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model)
  - `Dinov2Config` configuration class: `FlaxDinov2Model` (DINOv2 model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertModel` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraModel) (ELECTRA model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [FlaxGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2Model) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `FlaxGPTJModel` (GPT-J model)
  - `GPTNeoConfig` configuration class: `FlaxGPTNeoModel` (GPT Neo model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [FlaxGemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.FlaxGemmaModel) (Gemma model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `FlaxLlamaModel` (LLaMA model)
  - `LongT5Config` configuration class: `FlaxLongT5Model` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartModel` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5Model` (MT5 model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [FlaxMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianModel) (Marian model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [FlaxMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.FlaxMistralModel) (Mistral model)
  - `OPTConfig` configuration class: `FlaxOPTModel` (OPT model)
  - `PegasusConfig` configuration class: `FlaxPegasusModel` (Pegasus model)
  - `RegNetConfig` configuration class: `FlaxRegNetModel` (RegNet model)
  - `ResNetConfig` configuration class: `FlaxResNetModel` (ResNet model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerModel` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaModel) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `FlaxT5Model` (T5 model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [FlaxViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTModel) (ViT model)
  - `VisionTextDualEncoderConfig` configuration class: `FlaxVisionTextDualEncoderModel` (VisionTextDualEncoder model)
  - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2Model` (Wav2Vec2 model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperModel) (Whisper model)
  - `XGLMConfig` configuration class: `FlaxXGLMModel` (XGLM model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaModel` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModel.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

- [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertModel` (ALBERT model)
- [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartModel) (BART model)
- `BeitConfig` configuration class: `FlaxBeitModel` (BEiT model)
- [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertModel) (BERT model)
- `BigBirdConfig` configuration class: `FlaxBigBirdModel` (BigBird model)
- `BlenderbotConfig` configuration class: `FlaxBlenderbotModel` (Blenderbot model)
- `BlenderbotSmallConfig` configuration class: `FlaxBlenderbotSmallModel` (BlenderbotSmall model)
- `BloomConfig` configuration class: `FlaxBloomModel` (BLOOM model)
- [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [FlaxCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model)
- `Dinov2Config` configuration class: `FlaxDinov2Model` (DINOv2 model)
- `DistilBertConfig` configuration class: `FlaxDistilBertModel` (DistilBERT model)
- [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraModel) (ELECTRA model)
- [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [FlaxGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2Model) (OpenAI GPT-2 model)
- `GPTJConfig` configuration class: `FlaxGPTJModel` (GPT-J model)
- `GPTNeoConfig` configuration class: `FlaxGPTNeoModel` (GPT Neo model)
- [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [FlaxGemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.FlaxGemmaModel) (Gemma model)
- [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `FlaxLlamaModel` (LLaMA model)
- `LongT5Config` configuration class: `FlaxLongT5Model` (LongT5 model)
- `MBartConfig` configuration class: `FlaxMBartModel` (mBART model)
- `MT5Config` configuration class: `FlaxMT5Model` (MT5 model)
- [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [FlaxMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianModel) (Marian model)
- [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [FlaxMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.FlaxMistralModel) (Mistral model)
- `OPTConfig` configuration class: `FlaxOPTModel` (OPT model)
- `PegasusConfig` configuration class: `FlaxPegasusModel` (Pegasus model)
- `RegNetConfig` configuration class: `FlaxRegNetModel` (RegNet model)
- `ResNetConfig` configuration class: `FlaxResNetModel` (ResNet model)
- `RoFormerConfig` configuration class: `FlaxRoFormerModel` (RoFormer model)
- [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaModel) (RoBERTa model)
- `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- `T5Config` configuration class: `FlaxT5Model` (T5 model)
- [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [FlaxViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTModel) (ViT model)
- `VisionTextDualEncoderConfig` configuration class: `FlaxVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- `Wav2Vec2Config` configuration class: `FlaxWav2Vec2Model` (Wav2Vec2 model)
- [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperModel) (Whisper model)
- `XGLMConfig` configuration class: `FlaxXGLMModel` (XGLM model)
- `XLMRobertaConfig` configuration class: `FlaxXLMRobertaModel` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
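The default-selection rule described above can be sketched in plain Python. This is a hypothetical illustration of the documented behavior (SDPA for torch >= 2.1.1 when available, otherwise eager), not transformers' actual code; the function name and parameters are invented for the sketch.

```python
# Hypothetical sketch of the default attention-implementation selection
# described above; not the actual transformers implementation.
def pick_attn_implementation(requested=None, torch_version=(2, 1, 1), sdpa_available=True):
    """Return the attention implementation to use."""
    if requested is not None:
        # An explicit choice ("eager", "sdpa", or "flash_attention_2") wins.
        return requested
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"  # default for torch >= 2.1.1 when SDPA is available
    return "eager"     # otherwise fall back to the manual implementation

print(pick_attn_implementation())                         # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 0)))  # eager
```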
#### from_pretrained[[transformers.FlaxAutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertModel` (ALBERT model)
- **bart** -- [FlaxBartModel](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartModel) (BART model)
- **beit** -- `FlaxBeitModel` (BEiT model)
- **bert** -- [FlaxBertModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertModel) (BERT model)
- **big_bird** -- `FlaxBigBirdModel` (BigBird model)
- **blenderbot** -- `FlaxBlenderbotModel` (Blenderbot model)
- **blenderbot-small** -- `FlaxBlenderbotSmallModel` (BlenderbotSmall model)
- **bloom** -- `FlaxBloomModel` (BLOOM model)
- **clip** -- [FlaxCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.FlaxCLIPModel) (CLIP model)
- **dinov2** -- `FlaxDinov2Model` (DINOv2 model)
- **distilbert** -- `FlaxDistilBertModel` (DistilBERT model)
- **electra** -- [FlaxElectraModel](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraModel) (ELECTRA model)
- **gemma** -- [FlaxGemmaModel](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.FlaxGemmaModel) (Gemma model)
- **gpt-sw3** -- [FlaxGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2Model) (GPT-Sw3 model)
- **gpt2** -- [FlaxGPT2Model](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2Model) (OpenAI GPT-2 model)
- **gpt_neo** -- `FlaxGPTNeoModel` (GPT Neo model)
- **gptj** -- `FlaxGPTJModel` (GPT-J model)
- **llama** -- `FlaxLlamaModel` (LLaMA model)
- **longt5** -- `FlaxLongT5Model` (LongT5 model)
- **marian** -- [FlaxMarianModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianModel) (Marian model)
- **mbart** -- `FlaxMBartModel` (mBART model)
- **mistral** -- [FlaxMistralModel](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.FlaxMistralModel) (Mistral model)
- **mt5** -- `FlaxMT5Model` (MT5 model)
- **opt** -- `FlaxOPTModel` (OPT model)
- **pegasus** -- `FlaxPegasusModel` (Pegasus model)
- **regnet** -- `FlaxRegNetModel` (RegNet model)
- **resnet** -- `FlaxResNetModel` (ResNet model)
- **roberta** -- [FlaxRobertaModel](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaModel) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormModel` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerModel` (RoFormer model)
- **t5** -- `FlaxT5Model` (T5 model)
- **vision-text-dual-encoder** -- `FlaxVisionTextDualEncoderModel` (VisionTextDualEncoder model)
- **vit** -- [FlaxViTModel](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTModel) (ViT model)
- **wav2vec2** -- `FlaxWav2Vec2Model` (Wav2Vec2 model)
- **whisper** -- [FlaxWhisperModel](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperModel) (Whisper model)
- **xglm** -- `FlaxXGLMModel` (XGLM model)
- **xlm-roberta** -- `FlaxXLMRobertaModel` (XLM-RoBERTa model)
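The dispatch described above (the config's `model_type` first, then pattern matching on `pretrained_model_name_or_path`) can be illustrated with a minimal, hypothetical registry. The mapping and function below are invented for the sketch and are not transformers' actual implementation:

```python
# Toy registry standing in for the model_type -> class mapping above.
_MODEL_MAPPING = {
    "bert": "FlaxBertModel",
    "gpt2": "FlaxGPT2Model",
    "t5": "FlaxT5Model",
}

def resolve_model_class(model_type: str, name_or_path: str) -> str:
    """Pick a model class from the config's model_type, falling back to
    pattern matching on the checkpoint name, as described above."""
    if model_type in _MODEL_MAPPING:
        return _MODEL_MAPPING[model_type]
    # Fallback: pattern-match the pretrained name or path.
    for key, cls in _MODEL_MAPPING.items():
        if key in name_or_path:
            return cls
    raise ValueError(f"Unrecognized model type for {name_or_path!r}")

print(resolve_model_class("bert", "google-bert/bert-base-cased"))  # FlaxBertModel
```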

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
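The kwargs-splitting behavior described above can be sketched with a toy config class. This is an assumption-level illustration of the documented rule (keys matching a configuration attribute override the config; the rest go to the model), not transformers' actual code; `ToyConfig` and `split_kwargs` are invented for the example.

```python
# Toy illustration of the kwargs-splitting rule described above.
class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    """Apply config-attribute overrides; return leftovers for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            model_kwargs[key] = value    # passed on to the model's __init__
    return config, model_kwargs

cfg, extra = split_kwargs(ToyConfig(), output_attentions=True, foo=1)
print(cfg.output_attentions, extra)  # True {'foo': 1}
```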

## Generic pretraining classes[[generic-pretraining-classes]]

The following auto classes can be used to instantiate a model with a pretraining head.

### AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

#### transformers.AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1947)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForPreTraining` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForPreTraining) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForPreTraining` (BigBird model)
  - `BloomConfig` configuration class: `BloomForCausalLM` (BLOOM model)
  - `CTRLConfig` configuration class: `CTRLLMHeadModel` (CTRL model)
  - `CamembertConfig` configuration class: `CamembertForMaskedLM` (CamemBERT model)
  - `ColPaliConfig` configuration class: `ColPaliForRetrieval` (ColPali model)
  - `ColQwen2Config` configuration class: `ColQwen2ForRetrieval` (ColQwen2 model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForMaskedLM` (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForPreTraining) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForPreTraining` (ERNIE model)
  - `EvollaConfig` configuration class: `EvollaForProteinText2Text` (Evolla model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForPreTraining` (FNet model)
  - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
  - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model)
  - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model)
  - `FlavaConfig` configuration class: `FlavaForPreTraining` (FLAVA model)
  - `Florence2Config` configuration class: `Florence2ForConditionalGeneration` (Florence2 model)
  - `FunnelConfig` configuration class: `FunnelForPreTraining` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model)
  - `HieraConfig` configuration class: `HieraForPreTraining` (Hiera model)
  - `IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model)
  - `Idefics2Config` configuration class: `Idefics2ForConditionalGeneration` (Idefics2 model)
  - `Idefics3Config` configuration class: `Idefics3ForConditionalGeneration` (Idefics3 model)
  - `IdeficsConfig` configuration class: `IdeficsForVisionText2Text` (IDEFICS model)
  - `JanusConfig` configuration class: `JanusForConditionalGeneration` (Janus model)
  - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model)
  - `LlavaConfig` configuration class: `LlavaForConditionalGeneration` (LLaVa model)
  - `LlavaNextConfig` configuration class: `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
  - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
  - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
  - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertForPreTraining` (LXMERT model)
  - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model)
  - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model)
  - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model)
  - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForPreTraining` (Megatron-BERT model)
  - `Mistral3Config` configuration class: `Mistral3ForConditionalGeneration` (Mistral3 model)
  - `MllamaConfig` configuration class: `MllamaForConditionalGeneration` (Mllama model)
  - `MobileBertConfig` configuration class: `MobileBertForPreTraining` (MobileBERT model)
  - `MptConfig` configuration class: `MptForCausalLM` (MPT model)
  - `MraConfig` configuration class: `MraForMaskedLM` (MRA model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NezhaConfig` configuration class: `NezhaForPreTraining` (Nezha model)
  - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model)
  - [PaliGemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemma model)
  - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
  - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForPreTraining` (RoCBert model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model)
  - `SplinterConfig` configuration class: `SplinterForPreTraining` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
  - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model)
  - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model)
  - `TvltConfig` configuration class: `TvltForPreTraining` (TVLT model)
  - `UniSpeechConfig` configuration class: `UniSpeechForPreTraining` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForPreTraining` (UniSpeechSat model)
  - `ViTMAEConfig` configuration class: `ViTMAEForPreTraining` (ViTMAE model)
  - `VideoLlavaConfig` configuration class: `VideoLlavaForConditionalGeneration` (VideoLlava model)
  - `VideoMAEConfig` configuration class: `VideoMAEForPreTraining` (VideoMAE model)
  - `VipLlavaConfig` configuration class: `VipLlavaForConditionalGeneration` (VipLlava model)
  - `VisualBertConfig` configuration class: `VisualBertForPreTraining` (VisualBERT model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForPreTraining` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForPreTraining` (Wav2Vec2-Conformer model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model)
  - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model)
  - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForPreTraining` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForPreTraining) (BERT model) - `BigBirdConfig` configuration class: `BigBirdForPreTraining` (BigBird model) - `BloomConfig` configuration class: `BloomForCausalLM` (BLOOM model) - `CTRLConfig` configuration class: `CTRLLMHeadModel` (CTRL model) - `CamembertConfig` configuration class: `CamembertForMaskedLM` (CamemBERT model) - `ColPaliConfig` configuration class: `ColPaliForRetrieval` (ColPali model) - `ColQwen2Config` configuration class: `ColQwen2ForRetrieval` (ColQwen2 model) - `Data2VecTextConfig` configuration class: `Data2VecTextForMaskedLM` (Data2VecText model) - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model) - 
[ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForPreTraining) (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForPreTraining` (ERNIE model) - `EvollaConfig` configuration class: `EvollaForProteinText2Text` (Evolla model) - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model) - `FNetConfig` configuration class: `FNetForPreTraining` (FNet model) - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model) - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model) - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model) - `FlavaConfig` configuration class: `FlavaForPreTraining` (FLAVA model) - `Florence2Config` configuration class: `Florence2ForConditionalGeneration` (Florence2 model) - `FunnelConfig` configuration class: `FunnelForPreTraining` (Funnel Transformer model) - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model) - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model) - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model) - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model) - `HieraConfig` configuration class: `HieraForPreTraining` (Hiera model) - 
`IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model) - `Idefics2Config` configuration class: `Idefics2ForConditionalGeneration` (Idefics2 model) - `Idefics3Config` configuration class: `Idefics3ForConditionalGeneration` (Idefics3 model) - `IdeficsConfig` configuration class: `IdeficsForVisionText2Text` (IDEFICS model) - `JanusConfig` configuration class: `JanusForConditionalGeneration` (Janus model) - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model) - `LlavaConfig` configuration class: `LlavaForConditionalGeneration` (LLaVa model) - `LlavaNextConfig` configuration class: `LlavaNextForConditionalGeneration` (LLaVA-NeXT model) - `LlavaNextVideoConfig` configuration class: `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model) - `LlavaOnevisionConfig` configuration class: `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model) - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model) - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model) - `LxmertConfig` configuration class: `LxmertForPreTraining` (LXMERT model) - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model) - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model) - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model) - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForPreTraining` (Megatron-BERT model) - `Mistral3Config` configuration class: `Mistral3ForConditionalGeneration` (Mistral3 model) - `MllamaConfig` configuration class: `MllamaForConditionalGeneration` (Mllama model) - 
`MobileBertConfig` configuration class: `MobileBertForPreTraining` (MobileBERT model) - `MptConfig` configuration class: `MptForCausalLM` (MPT model) - `MraConfig` configuration class: `MraForMaskedLM` (MRA model) - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model) - `NezhaConfig` configuration class: `NezhaForPreTraining` (Nezha model) - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model) - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model) - [PaliGemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemma model) - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model) - `RetriBertConfig` configuration class: `RetriBertModel` (RetriBERT model) - `RoCBertConfig` configuration class: `RoCBertForPreTraining` (RoCBert model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMaskedLM) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model) - `SplinterConfig` configuration class: `SplinterForPreTraining` (Splinter model) - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model) - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model) - `T5Config` configuration class: `T5ForConditionalGeneration` 
(T5 model) - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model) - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model) - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model) - `TvltConfig` configuration class: `TvltForPreTraining` (TVLT model) - `UniSpeechConfig` configuration class: `UniSpeechForPreTraining` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForPreTraining` (UniSpeechSat model) - `ViTMAEConfig` configuration class: `ViTMAEForPreTraining` (ViTMAE model) - `VideoLlavaConfig` configuration class: `VideoLlavaForConditionalGeneration` (VideoLlava model) - `VideoMAEConfig` configuration class: `VideoMAEForPreTraining` (VideoMAE model) - `VipLlavaConfig` configuration class: `VipLlavaForConditionalGeneration` (VipLlava model) - `VisualBertConfig` configuration class: `VisualBertForPreTraining` (VisualBERT model) - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForPreTraining` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForPreTraining` (Wav2Vec2-Conformer model) - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model) - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model) - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForPreTraining` (ALBERT model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bert** -- [BertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForPreTraining) (BERT model)
- **big_bird** -- `BigBirdForPreTraining` (BigBird model)
- **bloom** -- `BloomForCausalLM` (BLOOM model)
- **camembert** -- `CamembertForMaskedLM` (CamemBERT model)
- **colpali** -- `ColPaliForRetrieval` (ColPali model)
- **colqwen2** -- `ColQwen2ForRetrieval` (ColQwen2 model)
- **ctrl** -- `CTRLLMHeadModel` (CTRL model)
- **data2vec-text** -- `Data2VecTextForMaskedLM` (Data2VecText model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMaskedLM` (DistilBERT model)
- **electra** -- [ElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForPreTraining) (ELECTRA model)
- **ernie** -- `ErnieForPreTraining` (ERNIE model)
- **evolla** -- `EvollaForProteinText2Text` (Evolla model)
- **exaone4** -- [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model)
- **falcon_mamba** -- `FalconMambaForCausalLM` (FalconMamba model)
- **flaubert** -- `FlaubertWithLMHeadModel` (FlauBERT model)
- **flava** -- `FlavaForPreTraining` (FLAVA model)
- **florence2** -- `Florence2ForConditionalGeneration` (Florence2 model)
- **fnet** -- `FNetForPreTraining` (FNet model)
- **fsmt** -- `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
- **funnel** -- `FunnelForPreTraining` (Funnel Transformer model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model)
- **gpt-sw3** -- [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT-Sw3 model)
- **gpt2** -- [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForCausalLM` (GPTBigCode model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **hiera** -- `HieraForPreTraining` (Hiera model)
- **ibert** -- `IBertForMaskedLM` (I-BERT model)
- **idefics** -- `IdeficsForVisionText2Text` (IDEFICS model)
- **idefics2** -- `Idefics2ForConditionalGeneration` (Idefics2 model)
- **idefics3** -- `Idefics3ForConditionalGeneration` (Idefics3 model)
- **janus** -- `JanusForConditionalGeneration` (Janus model)
- **layoutlm** -- `LayoutLMForMaskedLM` (LayoutLM model)
- **llava** -- `LlavaForConditionalGeneration` (LLaVa model)
- **llava_next** -- `LlavaNextForConditionalGeneration` (LLaVA-NeXT model)
- **llava_next_video** -- `LlavaNextVideoForConditionalGeneration` (LLaVa-NeXT-Video model)
- **llava_onevision** -- `LlavaOnevisionForConditionalGeneration` (LLaVA-Onevision model)
- **longformer** -- `LongformerForMaskedLM` (Longformer model)
- **luke** -- `LukeForMaskedLM` (LUKE model)
- **lxmert** -- `LxmertForPreTraining` (LXMERT model)
- **mamba** -- [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model)
- **mamba2** -- [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model)
- **mega** -- `MegaForMaskedLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForPreTraining` (Megatron-BERT model)
- **mistral3** -- `Mistral3ForConditionalGeneration` (Mistral3 model)
- **mllama** -- `MllamaForConditionalGeneration` (Mllama model)
- **mobilebert** -- `MobileBertForPreTraining` (MobileBERT model)
- **mpnet** -- `MPNetForMaskedLM` (MPNet model)
- **mpt** -- `MptForCausalLM` (MPT model)
- **mra** -- `MraForMaskedLM` (MRA model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nezha** -- `NezhaForPreTraining` (Nezha model)
- **nllb-moe** -- `NllbMoeForConditionalGeneration` (NLLB-MOE model)
- **openai-gpt** -- [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model)
- **paligemma** -- [PaliGemmaForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemma model)
- **qwen2_audio** -- `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
- **retribert** -- `RetriBertModel` (RetriBERT model)
- **roberta** -- [RobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForPreTraining` (RoCBert model)
- **rwkv** -- `RwkvForCausalLM` (RWKV model)
- **splinter** -- `SplinterForPreTraining` (Splinter model)
- **squeezebert** -- `SqueezeBertForMaskedLM` (SqueezeBERT model)
- **switch_transformers** -- `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
- **t5** -- `T5ForConditionalGeneration` (T5 model)
- **t5gemma** -- `T5GemmaForConditionalGeneration` (T5Gemma model)
- **tapas** -- `TapasForMaskedLM` (TAPAS model)
- **transfo-xl** -- `TransfoXLLMHeadModel` (Transformer-XL model)
- **tvlt** -- `TvltForPreTraining` (TVLT model)
- **unispeech** -- `UniSpeechForPreTraining` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForPreTraining` (UniSpeechSat model)
- **video_llava** -- `VideoLlavaForConditionalGeneration` (VideoLlava model)
- **videomae** -- `VideoMAEForPreTraining` (VideoMAE model)
- **vipllava** -- `VipLlavaForConditionalGeneration` (VipLlava model)
- **visual_bert** -- `VisualBertForPreTraining` (VisualBERT model)
- **vit_mae** -- `ViTMAEForPreTraining` (ViTMAE model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **wav2vec2** -- `Wav2Vec2ForPreTraining` (Wav2Vec2 model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForPreTraining` (Wav2Vec2-Conformer model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetLMHeadModel` (XLNet model)
- **xlstm** -- `xLSTMForCausalLM` (xLSTM model)
- **xmod** -- `XmodForMaskedLM` (X-MOD model)
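As a quick offline check of the dispatch described above, the `model_type` of the config alone determines which class gets built. The tiny config below is only a sketch (its sizes are arbitrary, chosen to keep the model small), and no weights are downloaded:

```python
from transformers import AutoModelForPreTraining, BertConfig

# BertConfig.model_type is "bert", so AutoModelForPreTraining
# resolves to BertForPreTraining per the mapping above.
config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModelForPreTraining.from_config(config)
print(type(model).__name__)  # BertForPreTraining
```

Note that `from_config()` builds a randomly initialized model; use `from_pretrained()` when you need trained weights.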

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
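The evaluation/training-mode toggling mentioned above can likewise be verified without downloading any weights; this sketch reuses a tiny, randomly initialized config (sizes arbitrary):

```python
from transformers import AutoModelForPreTraining, BertConfig

config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModelForPreTraining.from_config(config)

model.eval()  # deactivates dropout modules
assert not model.training

model.train()  # switch back before fine-tuning
assert model.training
```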

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForPreTraining[[transformers.TFAutoModelForPreTraining]]

#### transformers.TFAutoModelForPreTraining[[transformers.TFAutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L554)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForPreTraining` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForPreTraining) (BERT model)
  - `CTRLConfig` configuration class: `TFCTRLLMHeadModel` (CTRL model)
  - `CamembertConfig` configuration class: `TFCamembertForMaskedLM` (CamemBERT model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForPreTraining) (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForPreTraining` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (OpenAI GPT-2 model)
  - `IdeficsConfig` configuration class: `TFIdeficsForVisionText2Text` (IDEFICS model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model)
  - `LxmertConfig` configuration class: `TFLxmertForPreTraining` (LXMERT model)
  - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForPreTraining` (MobileBERT model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTLMHeadModel) (OpenAI GPT model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)
  - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model)
  - `ViTMAEConfig` configuration class: `TFViTMAEForPreTraining` (ViTMAE model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForPreTraining` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForPreTraining) (BERT model) - `CTRLConfig` configuration class: `TFCTRLLMHeadModel` (CTRL model) - `CamembertConfig` configuration class: `TFCamembertForMaskedLM` (CamemBERT model) - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForPreTraining) (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForPreTraining` (Funnel Transformer model) - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (OpenAI GPT-2 model) - `IdeficsConfig` configuration class: `TFIdeficsForVisionText2Text` (IDEFICS model) - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model) - `LxmertConfig` configuration class: `TFLxmertForPreTraining` (LXMERT model) - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForPreTraining` (MobileBERT model) - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTLMHeadModel) (OpenAI GPT model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMaskedLM) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model) - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model) - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model) - `ViTMAEConfig` configuration class: `TFViTMAEForPreTraining` (ViTMAE model) - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForPreTraining` (ALBERT model)
- **bart** -- [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
- **bert** -- [TFBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForPreTraining) (BERT model)
- **camembert** -- `TFCamembertForMaskedLM` (CamemBERT model)
- **ctrl** -- `TFCTRLLMHeadModel` (CTRL model)
- **distilbert** -- `TFDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- [TFElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForPreTraining) (ELECTRA model)
- **flaubert** -- `TFFlaubertWithLMHeadModel` (FlauBERT model)
- **funnel** -- `TFFunnelForPreTraining` (Funnel Transformer model)
- **gpt-sw3** -- [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (GPT-Sw3 model)
- **gpt2** -- [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (OpenAI GPT-2 model)
- **idefics** -- `TFIdeficsForVisionText2Text` (IDEFICS model)
- **layoutlm** -- `TFLayoutLMForMaskedLM` (LayoutLM model)
- **lxmert** -- `TFLxmertForPreTraining` (LXMERT model)
- **mobilebert** -- `TFMobileBertForPreTraining` (MobileBERT model)
- **mpnet** -- `TFMPNetForMaskedLM` (MPNet model)
- **openai-gpt** -- [TFOpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTLMHeadModel) (OpenAI GPT model)
- **roberta** -- [TFRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **t5** -- `TFT5ForConditionalGeneration` (T5 model)
- **tapas** -- `TFTapasForMaskedLM` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLLMHeadModel` (Transformer-XL model)
- **vit_mae** -- `TFViTMAEForPreTraining` (ViTMAE model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetLMHeadModel` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForPreTraining[[transformers.FlaxAutoModelForPreTraining]]

#### transformers.FlaxAutoModelForPreTraining[[transformers.FlaxAutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L288)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForPreTraining` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForPreTraining` (BigBird model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForPreTraining) (ELECTRA model)
  - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model)
  - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForPreTraining` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForPreTraining` (BigBird model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForPreTraining) (ELECTRA model) - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model) - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model) - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model) - `Wav2Vec2Config` configuration class: `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
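The default-selection rule for `attn_implementation` can be restated as a small helper. The function name and parameters below are hypothetical, written only to express the rule from the parameter description:

```python
def pick_attn_implementation(requested=None, sdpa_available=True, torch_version=(2, 2, 0)):
    # Hypothetical helper: an explicit request wins; otherwise SDPA is
    # preferred for torch>=2.1.1 when available, with the manual "eager"
    # implementation as the fallback.
    if requested is not None:
        return requested
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"
```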
#### from_pretrained[[transformers.FlaxAutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForPreTraining` (ALBERT model)
- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **bert** -- [FlaxBertForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForPreTraining) (BERT model)
- **big_bird** -- `FlaxBigBirdForPreTraining` (BigBird model)
- **electra** -- [FlaxElectraForPreTraining](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForPreTraining) (ELECTRA model)
- **longt5** -- `FlaxLongT5ForConditionalGeneration` (LongT5 model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **mt5** -- `FlaxMT5ForConditionalGeneration` (MT5 model)
- **roberta** -- [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMaskedLM` (RoFormer model)
- **t5** -- `FlaxT5ForConditionalGeneration` (T5 model)
- **wav2vec2** -- `FlaxWav2Vec2ForPreTraining` (Wav2Vec2 model)
- **whisper** -- [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model)
- **xlm-roberta** -- `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
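The `model_type` lookup above is essentially a dictionary dispatch. Below is a toy excerpt with only three of the entries and class names kept as strings; the real mapping objects live in `transformers.models.auto`:

```python
# Toy excerpt of the model_type -> model-class mapping used for dispatch.
FLAX_PRETRAINING_MAPPING = {
    "bert": "FlaxBertForPreTraining",
    "bart": "FlaxBartForConditionalGeneration",
    "t5": "FlaxT5ForConditionalGeneration",
}

def resolve_class(model_type: str) -> str:
    # Unknown model types raise, just as the auto classes reject
    # configurations they have no mapping for.
    if model_type not in FLAX_PRETRAINING_MAPPING:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")
    return FLAX_PRETRAINING_MAPPING[model_type]
```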

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## 자연어 처리[[natural-language-processing]]

다음 자동 클래스들은 아래의 자연어 처리 작업에 사용할 수 있습니다.

### AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

#### transformers.AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1962)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ApertusConfig` configuration class: `ApertusForCausalLM` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeForCausalLM` (Arcee model)
  - `AriaTextConfig` configuration class: `AriaTextForCausalLM` (AriaText model)
  - `BambaConfig` configuration class: `BambaForCausalLM` (Bamba model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForCausalLM) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertLMHeadModel) (BERT model)
  - `BertGenerationConfig` configuration class: `BertGenerationDecoder` (Bert Generation model)
  - `BigBirdConfig` configuration class: `BigBirdForCausalLM` (BigBird model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForCausalLM` (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGpt model)
  - `BitNetConfig` configuration class: `BitNetForCausalLM` (BitNet model)
  - `BlenderbotConfig` configuration class: `BlenderbotForCausalLM` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallForCausalLM` (BlenderbotSmall model)
  - `BloomConfig` configuration class: `BloomForCausalLM` (BLOOM model)
  - `BltConfig` configuration class: `BltForCausalLM` (Blt model)
  - `CTRLConfig` configuration class: `CTRLLMHeadModel` (CTRL model)
  - `CamembertConfig` configuration class: `CamembertForCausalLM` (CamemBERT model)
  - [CodeGenConfig](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGen model)
  - `Cohere2Config` configuration class: `Cohere2ForCausalLM` (Cohere2 model)
  - [CohereConfig](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereForCausalLM) (Cohere model)
  - `CpmAntConfig` configuration class: `CpmAntForCausalLM` (CPM-Ant model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForCausalLM` (Data2VecText model)
  - [DbrxConfig](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxForCausalLM) (DBRX model)
  - `DeepseekV2Config` configuration class: `DeepseekV2ForCausalLM` (DeepSeek-V2 model)
  - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForCausalLM` (DiffLlama model)
  - `DogeConfig` configuration class: `DogeForCausalLM` (Doge model)
  - `Dots1Config` configuration class: `Dots1ForCausalLM` (dots1 model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForCausalLM) (ELECTRA model)
  - `Emu3Config` configuration class: `Emu3ForCausalLM` (Emu3 model)
  - `Ernie4_5Config` configuration class: `Ernie4_5ForCausalLM` (Ernie4_5 model)
  - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeForCausalLM` (Ernie4_5_MoE model)
  - `ErnieConfig` configuration class: `ErnieForCausalLM` (ERNIE model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model)
  - `FalconConfig` configuration class: `FalconForCausalLM` (Falcon model)
  - `FalconH1Config` configuration class: `FalconH1ForCausalLM` (FalconH1 model)
  - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model)
  - `FlexOlmoConfig` configuration class: `FlexOlmoForCausalLM` (FlexOlmo model)
  - `FuyuConfig` configuration class: `FuyuForCausalLM` (Fuyu model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJForCausalLM` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForCausalLM` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForCausalLM` (GPT NeoX model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPT NeoX Japanese model)
  - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2 model)
  - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model)
  - [Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3ForCausalLM model)
  - `Gemma3nConfig` configuration class: `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
  - `Gemma3nTextConfig` configuration class: `Gemma3nForCausalLM` (Gemma3nForCausalLM model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForCausalLM) (Gemma model)
  - `GitConfig` configuration class: `GitForCausalLM` (GIT model)
  - `Glm4Config` configuration class: `Glm4ForCausalLM` (GLM4 model)
  - `Glm4MoeConfig` configuration class: `Glm4MoeForCausalLM` (Glm4MoE model)
  - `GlmConfig` configuration class: `GlmForCausalLM` (GLM model)
  - `GotOcr2Config` configuration class: `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
  - `GptOssConfig` configuration class: `GptOssForCausalLM` (GptOss model)
  - `GraniteConfig` configuration class: `GraniteForCausalLM` (Granite model)
  - `GraniteMoeConfig` configuration class: `GraniteMoeForCausalLM` (GraniteMoeMoe model)
  - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridForCausalLM` (GraniteMoeHybrid model)
  - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedForCausalLM` (GraniteMoeSharedMoe model)
  - `HeliumConfig` configuration class: `HeliumForCausalLM` (Helium model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForCausalLM` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForCausalLM` (HunYuanMoeV1 model)
  - [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaForCausalLM) (Jamba model)
  - `JetMoeConfig` configuration class: `JetMoeForCausalLM` (JetMoe model)
  - `Lfm2Config` configuration class: `Lfm2ForCausalLM` (Lfm2 model)
  - `Llama4Config` configuration class: `Llama4ForCausalLM` (Llama4 model)
  - `Llama4TextConfig` configuration class: `Llama4ForCausalLM` (Llama4ForCausalLM model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForCausalLM) (LLaMA model)
  - `LongcatFlashConfig` configuration class: `LongcatFlashForCausalLM` (LongCatFlash model)
  - `MBartConfig` configuration class: `MBartForCausalLM` (mBART model)
  - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model)
  - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianForCausalLM) (Marian model)
  - `MegaConfig` configuration class: `MegaForCausalLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForCausalLM` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForCausalLM` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForCausalLM` (Ministral model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForCausalLM) (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForCausalLM` (Mixtral model)
  - `MllamaConfig` configuration class: `MllamaForCausalLM` (Mllama model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForCausalLM` (ModernBertDecoder model)
  - `MoshiConfig` configuration class: `MoshiForCausalLM` (Moshi model)
  - `MptConfig` configuration class: `MptForCausalLM` (MPT model)
  - `MusicgenConfig` configuration class: `MusicgenForCausalLM` (MusicGen model)
  - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyForCausalLM` (MusicGen Melody model)
  - `MvpConfig` configuration class: `MvpForCausalLM` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForCausalLM` (Nemotron model)
  - `OPTConfig` configuration class: `OPTForCausalLM` (OPT model)
  - `Olmo2Config` configuration class: `Olmo2ForCausalLM` (OLMo2 model)
  - `Olmo3Config` configuration class: `Olmo3ForCausalLM` (Olmo3 model)
  - `OlmoConfig` configuration class: `OlmoForCausalLM` (OLMo model)
  - `OlmoeConfig` configuration class: `OlmoeForCausalLM` (OLMoE model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaForCausalLM` (OpenLlama model)
  - `PLBartConfig` configuration class: `PLBartForCausalLM` (PLBart model)
  - `PegasusConfig` configuration class: `PegasusForCausalLM` (Pegasus model)
  - `PersimmonConfig` configuration class: `PersimmonForCausalLM` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForCausalLM` (Phi3 model)
  - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalForCausalLM` (Phi4Multimodal model)
  - `PhiConfig` configuration class: `PhiForCausalLM` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeForCausalLM` (Phimoe model)
  - `ProphetNetConfig` configuration class: `ProphetNetForCausalLM` (ProphetNet model)
  - `QDQBertConfig` configuration class: `QDQBertLMHeadModel` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForCausalLM` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForCausalLM` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForCausalLM` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForCausalLM` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForCausalLM` (Qwen3Next model)
  - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaForCausalLM` (RecurrentGemma model)
  - `ReformerConfig` configuration class: `ReformerModelWithLMHead` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForCausalLM` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForCausalLM` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForCausalLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForCausalLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model)
  - `SeedOssConfig` configuration class: `SeedOssForCausalLM` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForCausalLM` (SmolLM3 model)
  - `Speech2Text2Config` configuration class: `Speech2Text2ForCausalLM` (Speech2Text2 model)
  - `StableLmConfig` configuration class: `StableLmForCausalLM` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForCausalLM` (Starcoder2 model)
  - `TrOCRConfig` configuration class: `TrOCRForCausalLM` (TrOCR model)
  - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model)
  - `VaultGemmaConfig` configuration class: `VaultGemmaForCausalLM` (VaultGemma model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: `WhisperForCausalLM` (Whisper model)
  - `XGLMConfig` configuration class: `XGLMForCausalLM` (XGLM model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetForCausalLM` (XLM-ProphetNet model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForCausalLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForCausalLM` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model)
  - `XmodConfig` configuration class: `XmodForCausalLM` (X-MOD model)
  - `Zamba2Config` configuration class: `Zamba2ForCausalLM` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaForCausalLM` (Zamba model)
  - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `ApertusConfig` configuration class: `ApertusForCausalLM` (Apertus model) - `ArceeConfig` configuration class: `ArceeForCausalLM` (Arcee model) - `AriaTextConfig` configuration class: `AriaTextForCausalLM` (AriaText model) - `BambaConfig` configuration class: `BambaForCausalLM` (Bamba model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForCausalLM) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertLMHeadModel) (BERT model) - `BertGenerationConfig` configuration class: `BertGenerationDecoder` (Bert Generation model) - `BigBirdConfig` configuration class: `BigBirdForCausalLM` (BigBird model) - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForCausalLM` (BigBird-Pegasus model) - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGpt model) - `BitNetConfig` configuration class: `BitNetForCausalLM` (BitNet model) - `BlenderbotConfig` configuration class: `BlenderbotForCausalLM` (Blenderbot model) - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallForCausalLM` (BlenderbotSmall model) - `BloomConfig` configuration class: `BloomForCausalLM` (BLOOM model) - `BltConfig` configuration class: `BltForCausalLM` (Blt model) - `CTRLConfig` configuration class: `CTRLLMHeadModel` (CTRL model) - `CamembertConfig` configuration class: `CamembertForCausalLM` (CamemBERT model) - [CodeGenConfig](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGen model) - `Cohere2Config` configuration class: `Cohere2ForCausalLM` (Cohere2 model) - [CohereConfig](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereForCausalLM) (Cohere model) - `CpmAntConfig` configuration class: `CpmAntForCausalLM` (CPM-Ant model) - `Data2VecTextConfig` configuration class: `Data2VecTextForCausalLM` (Data2VecText model) - [DbrxConfig](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxForCausalLM) (DBRX model) - `DeepseekV2Config` configuration class: `DeepseekV2ForCausalLM` (DeepSeek-V2 model) - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepSeek-V3 model) - `DiffLlamaConfig` configuration class: `DiffLlamaForCausalLM` (DiffLlama model) - `DogeConfig` configuration class: `DogeForCausalLM` (Doge model) - `Dots1Config` configuration class: `Dots1ForCausalLM` (dots1 model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForCausalLM) (ELECTRA model) - `Emu3Config` configuration class: `Emu3ForCausalLM` (Emu3 model) - `Ernie4_5Config` configuration class: `Ernie4_5ForCausalLM` (Ernie4_5 model) - `Ernie4_5_MoeConfig` configuration class: `Ernie4_5_MoeForCausalLM` (Ernie4_5_MoE model) - `ErnieConfig` configuration class: `ErnieForCausalLM` (ERNIE model) - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model) - `FalconConfig` configuration class: `FalconForCausalLM` (Falcon model) - `FalconH1Config` configuration class: `FalconH1ForCausalLM` (FalconH1 model) - `FalconMambaConfig` configuration class: `FalconMambaForCausalLM` (FalconMamba model) - `FlexOlmoConfig` configuration class: `FlexOlmoForCausalLM` (FlexOlmo model) - `FuyuConfig` configuration class: `FuyuForCausalLM` (Fuyu model) - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model) - `GPTBigCodeConfig` configuration class: `GPTBigCodeForCausalLM` (GPTBigCode model) - `GPTJConfig` configuration class: `GPTJForCausalLM` (GPT-J model) - `GPTNeoConfig` configuration class: `GPTNeoForCausalLM` (GPT Neo model) - `GPTNeoXConfig` configuration class: `GPTNeoXForCausalLM` (GPT NeoX model) - [GPTNeoXJapaneseConfig](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPT NeoX Japanese model) - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2 model) - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model) - 
[Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3ForCausalLM model) - `Gemma3nConfig` configuration class: `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model) - `Gemma3nTextConfig` configuration class: `Gemma3nForCausalLM` (Gemma3nForCausalLM model) - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForCausalLM) (Gemma model) - `GitConfig` configuration class: `GitForCausalLM` (GIT model) - `Glm4Config` configuration class: `Glm4ForCausalLM` (GLM4 model) - `Glm4MoeConfig` configuration class: `Glm4MoeForCausalLM` (Glm4MoE model) - `GlmConfig` configuration class: `GlmForCausalLM` (GLM model) - `GotOcr2Config` configuration class: `GotOcr2ForConditionalGeneration` (GOT-OCR2 model) - `GptOssConfig` configuration class: `GptOssForCausalLM` (GptOss model) - `GraniteConfig` configuration class: `GraniteForCausalLM` (Granite model) - `GraniteMoeConfig` configuration class: `GraniteMoeForCausalLM` (GraniteMoeMoe model) - `GraniteMoeHybridConfig` configuration class: `GraniteMoeHybridForCausalLM` (GraniteMoeHybrid model) - `GraniteMoeSharedConfig` configuration class: `GraniteMoeSharedForCausalLM` (GraniteMoeSharedMoe model) - `HeliumConfig` configuration class: `HeliumForCausalLM` (Helium model) - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForCausalLM` (HunYuanDenseV1 model) - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForCausalLM` (HunYuanMoeV1 model) - [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaForCausalLM) (Jamba model) - `JetMoeConfig` configuration class: 
`JetMoeForCausalLM` (JetMoe model) - `Lfm2Config` configuration class: `Lfm2ForCausalLM` (Lfm2 model) - `Llama4Config` configuration class: `Llama4ForCausalLM` (Llama4 model) - `Llama4TextConfig` configuration class: `Llama4ForCausalLM` (Llama4ForCausalLM model) - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForCausalLM) (LLaMA model) - `LongcatFlashConfig` configuration class: `LongcatFlashForCausalLM` (LongCatFlash model) - `MBartConfig` configuration class: `MBartForCausalLM` (mBART model) - [Mamba2Config](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model) - [MambaConfig](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model) - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianForCausalLM) (Marian model) - `MegaConfig` configuration class: `MegaForCausalLM` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForCausalLM` (Megatron-BERT model) - `MiniMaxConfig` configuration class: `MiniMaxForCausalLM` (MiniMax model) - `MinistralConfig` configuration class: `MinistralForCausalLM` (Ministral model) - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForCausalLM) (Mistral model) - `MixtralConfig` configuration class: `MixtralForCausalLM` (Mixtral model) - `MllamaConfig` configuration class: `MllamaForCausalLM` 
(Mllama model) - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForCausalLM` (ModernBertDecoder model) - `MoshiConfig` configuration class: `MoshiForCausalLM` (Moshi model) - `MptConfig` configuration class: `MptForCausalLM` (MPT model) - `MusicgenConfig` configuration class: `MusicgenForCausalLM` (MusicGen model) - `MusicgenMelodyConfig` configuration class: `MusicgenMelodyForCausalLM` (MusicGen Melody model) - `MvpConfig` configuration class: `MvpForCausalLM` (MVP model) - `NemotronConfig` configuration class: `NemotronForCausalLM` (Nemotron model) - `OPTConfig` configuration class: `OPTForCausalLM` (OPT model) - `Olmo2Config` configuration class: `Olmo2ForCausalLM` (OLMo2 model) - `Olmo3Config` configuration class: `Olmo3ForCausalLM` (Olmo3 model) - `OlmoConfig` configuration class: `OlmoForCausalLM` (OLMo model) - `OlmoeConfig` configuration class: `OlmoeForCausalLM` (OLMoE model) - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model) - `OpenLlamaConfig` configuration class: `OpenLlamaForCausalLM` (OpenLlama model) - `PLBartConfig` configuration class: `PLBartForCausalLM` (PLBart model) - `PegasusConfig` configuration class: `PegasusForCausalLM` (Pegasus model) - `PersimmonConfig` configuration class: `PersimmonForCausalLM` (Persimmon model) - `Phi3Config` configuration class: `Phi3ForCausalLM` (Phi3 model) - `Phi4MultimodalConfig` configuration class: `Phi4MultimodalForCausalLM` (Phi4Multimodal model) - `PhiConfig` configuration class: `PhiForCausalLM` (Phi model) - `PhimoeConfig` configuration class: `PhimoeForCausalLM` (Phimoe model) - `ProphetNetConfig` configuration class: `ProphetNetForCausalLM` (ProphetNet model) - `QDQBertConfig` configuration class: `QDQBertLMHeadModel` (QDQBert model) - `Qwen2Config` configuration class: `Qwen2ForCausalLM` 
(Qwen2 model) - `Qwen2MoeConfig` configuration class: `Qwen2MoeForCausalLM` (Qwen2MoE model) - `Qwen3Config` configuration class: `Qwen3ForCausalLM` (Qwen3 model) - `Qwen3MoeConfig` configuration class: `Qwen3MoeForCausalLM` (Qwen3MoE model) - `Qwen3NextConfig` configuration class: `Qwen3NextForCausalLM` (Qwen3Next model) - `RecurrentGemmaConfig` configuration class: `RecurrentGemmaForCausalLM` (RecurrentGemma model) - `ReformerConfig` configuration class: `ReformerModelWithLMHead` (Reformer model) - `RemBertConfig` configuration class: `RemBertForCausalLM` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForCausalLM` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForCausalLM` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForCausalLM) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model) - `RwkvConfig` configuration class: `RwkvForCausalLM` (RWKV model) - `SeedOssConfig` configuration class: `SeedOssForCausalLM` (SeedOss model) - `SmolLM3Config` configuration class: `SmolLM3ForCausalLM` (SmolLM3 model) - `Speech2Text2Config` configuration class: `Speech2Text2ForCausalLM` (Speech2Text2 model) - `StableLmConfig` configuration class: `StableLmForCausalLM` (StableLm model) - `Starcoder2Config` configuration class: `Starcoder2ForCausalLM` (Starcoder2 model) - `TrOCRConfig` configuration class: `TrOCRForCausalLM` (TrOCR model) - `TransfoXLConfig` configuration class: `TransfoXLLMHeadModel` (Transformer-XL model) - `VaultGemmaConfig` configuration class: `VaultGemmaForCausalLM` (VaultGemma model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: `WhisperForCausalLM` (Whisper model) - `XGLMConfig` configuration class: 
`XGLMForCausalLM` (XGLM model) - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model) - `XLMProphetNetConfig` configuration class: `XLMProphetNetForCausalLM` (XLM-ProphetNet model) - `XLMRobertaConfig` configuration class: `XLMRobertaForCausalLM` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForCausalLM` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetLMHeadModel` (XLNet model) - `XmodConfig` configuration class: `XmodForCausalLM` (X-MOD model) - `Zamba2Config` configuration class: `Zamba2ForCausalLM` (Zamba2 model) - `ZambaConfig` configuration class: `ZambaForCausalLM` (Zamba model) - `xLSTMConfig` configuration class: `xLSTMForCausalLM` (xLSTM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
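The version-gated default described above ("sdpa" for torch>=2.1.1, otherwise "eager") can be sketched as a small version check. This is an illustrative, hypothetical helper only, not the actual Transformers code, which also checks hardware and backend support:

```python
# Illustrative sketch of picking a default attention implementation from a
# torch version string (hypothetical helper; the real logic also verifies
# that SDPA is actually supported by the model and backend).
def pick_default_attn_implementation(torch_version: str) -> str:
    """Return "sdpa" for torch >= 2.1.1, otherwise fall back to "eager"."""
    parts = tuple(int(p) for p in torch_version.split(".")[:3])
    return "sdpa" if parts >= (2, 1, 1) else "eager"

print(pick_default_attn_implementation("2.4.0"))  # sdpa
print(pick_default_attn_implementation("2.0.1"))  # eager
```

In practice you rarely need this: passing `attn_implementation="eager"` (or `"sdpa"`, `"flash_attention_2"`) explicitly overrides the default.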
#### from_pretrained[[transformers.AutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **apertus** -- `ApertusForCausalLM` (Apertus model)
- **arcee** -- `ArceeForCausalLM` (Arcee model)
- **aria_text** -- `AriaTextForCausalLM` (AriaText model)
- **bamba** -- `BambaForCausalLM` (Bamba model)
- **bart** -- [BartForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForCausalLM) (BART model)
- **bert** -- [BertLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertLMHeadModel) (BERT model)
- **bert-generation** -- `BertGenerationDecoder` (Bert Generation model)
- **big_bird** -- `BigBirdForCausalLM` (BigBird model)
- **bigbird_pegasus** -- `BigBirdPegasusForCausalLM` (BigBird-Pegasus model)
- **biogpt** -- [BioGptForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGpt model)
- **bitnet** -- `BitNetForCausalLM` (BitNet model)
- **blenderbot** -- `BlenderbotForCausalLM` (Blenderbot model)
- **blenderbot-small** -- `BlenderbotSmallForCausalLM` (BlenderbotSmall model)
- **bloom** -- `BloomForCausalLM` (BLOOM model)
- **blt** -- `BltForCausalLM` (Blt model)
- **camembert** -- `CamembertForCausalLM` (CamemBERT model)
- **code_llama** -- [LlamaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForCausalLM) (CodeLlama model)
- **codegen** -- [CodeGenForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGen model)
- **cohere** -- [CohereForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/cohere#transformers.CohereForCausalLM) (Cohere model)
- **cohere2** -- `Cohere2ForCausalLM` (Cohere2 model)
- **cpmant** -- `CpmAntForCausalLM` (CPM-Ant model)
- **ctrl** -- `CTRLLMHeadModel` (CTRL model)
- **data2vec-text** -- `Data2VecTextForCausalLM` (Data2VecText model)
- **dbrx** -- [DbrxForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/dbrx#transformers.DbrxForCausalLM) (DBRX model)
- **deepseek_v2** -- `DeepseekV2ForCausalLM` (DeepSeek-V2 model)
- **deepseek_v3** -- [DeepseekV3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForCausalLM` (DiffLlama model)
- **doge** -- `DogeForCausalLM` (Doge model)
- **dots1** -- `Dots1ForCausalLM` (dots1 model)
- **electra** -- [ElectraForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForCausalLM) (ELECTRA model)
- **emu3** -- `Emu3ForCausalLM` (Emu3 model)
- **ernie** -- `ErnieForCausalLM` (ERNIE model)
- **ernie4_5** -- `Ernie4_5ForCausalLM` (Ernie4_5 model)
- **ernie4_5_moe** -- `Ernie4_5_MoeForCausalLM` (Ernie4_5_MoE model)
- **exaone4** -- [Exaone4ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForCausalLM) (EXAONE-4.0 model)
- **falcon** -- `FalconForCausalLM` (Falcon model)
- **falcon_h1** -- `FalconH1ForCausalLM` (FalconH1 model)
- **falcon_mamba** -- `FalconMambaForCausalLM` (FalconMamba model)
- **flex_olmo** -- `FlexOlmoForCausalLM` (FlexOlmo model)
- **fuyu** -- `FuyuForCausalLM` (Fuyu model)
- **gemma** -- [GemmaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForCausalLM) (Gemma model)
- **gemma2** -- [Gemma2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2 model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- [Gemma3ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3ForCausalLM model)
- **gemma3n** -- `Gemma3nForConditionalGeneration` (Gemma3nForConditionalGeneration model)
- **gemma3n_text** -- `Gemma3nForCausalLM` (Gemma3nForCausalLM model)
- **git** -- `GitForCausalLM` (GIT model)
- **glm** -- `GlmForCausalLM` (GLM model)
- **glm4** -- `Glm4ForCausalLM` (GLM4 model)
- **glm4_moe** -- `Glm4MoeForCausalLM` (Glm4MoE model)
- **got_ocr2** -- `GotOcr2ForConditionalGeneration` (GOT-OCR2 model)
- **gpt-sw3** -- [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT-Sw3 model)
- **gpt2** -- [GPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2LMHeadModel) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForCausalLM` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForCausalLM` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForCausalLM` (GPT NeoX model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPT NeoX Japanese model)
- **gpt_oss** -- `GptOssForCausalLM` (GptOss model)
- **gptj** -- `GPTJForCausalLM` (GPT-J model)
- **granite** -- `GraniteForCausalLM` (Granite model)
- **granitemoe** -- `GraniteMoeForCausalLM` (GraniteMoeMoe model)
- **granitemoehybrid** -- `GraniteMoeHybridForCausalLM` (GraniteMoeHybrid model)
- **granitemoeshared** -- `GraniteMoeSharedForCausalLM` (GraniteMoeSharedMoe model)
- **helium** -- `HeliumForCausalLM` (Helium model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1ForCausalLM` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1ForCausalLM` (HunYuanMoeV1 model)
- **jamba** -- [JambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaForCausalLM) (Jamba model)
- **jetmoe** -- `JetMoeForCausalLM` (JetMoe model)
- **lfm2** -- `Lfm2ForCausalLM` (Lfm2 model)
- **llama** -- [LlamaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForCausalLM) (LLaMA model)
- **llama4** -- `Llama4ForCausalLM` (Llama4 model)
- **llama4_text** -- `Llama4ForCausalLM` (Llama4ForCausalLM model)
- **longcat_flash** -- `LongcatFlashForCausalLM` (LongCatFlash model)
- **mamba** -- [MambaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba#transformers.MambaForCausalLM) (Mamba model)
- **mamba2** -- [Mamba2ForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mamba2#transformers.Mamba2ForCausalLM) (mamba2 model)
- **marian** -- [MarianForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianForCausalLM) (Marian model)
- **mbart** -- `MBartForCausalLM` (mBART model)
- **mega** -- `MegaForCausalLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForCausalLM` (Megatron-BERT model)
- **minimax** -- `MiniMaxForCausalLM` (MiniMax model)
- **ministral** -- `MinistralForCausalLM` (Ministral model)
- **mistral** -- [MistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForCausalLM) (Mistral model)
- **mixtral** -- `MixtralForCausalLM` (Mixtral model)
- **mllama** -- `MllamaForCausalLM` (Mllama model)
- **modernbert-decoder** -- `ModernBertDecoderForCausalLM` (ModernBertDecoder model)
- **moshi** -- `MoshiForCausalLM` (Moshi model)
- **mpt** -- `MptForCausalLM` (MPT model)
- **musicgen** -- `MusicgenForCausalLM` (MusicGen model)
- **musicgen_melody** -- `MusicgenMelodyForCausalLM` (MusicGen Melody model)
- **mvp** -- `MvpForCausalLM` (MVP model)
- **nemotron** -- `NemotronForCausalLM` (Nemotron model)
- **olmo** -- `OlmoForCausalLM` (OLMo model)
- **olmo2** -- `Olmo2ForCausalLM` (OLMo2 model)
- **olmo3** -- `Olmo3ForCausalLM` (Olmo3 model)
- **olmoe** -- `OlmoeForCausalLM` (OLMoE model)
- **open-llama** -- `OpenLlamaForCausalLM` (OpenLlama model)
- **openai-gpt** -- [OpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAI GPT model)
- **opt** -- `OPTForCausalLM` (OPT model)
- **pegasus** -- `PegasusForCausalLM` (Pegasus model)
- **persimmon** -- `PersimmonForCausalLM` (Persimmon model)
- **phi** -- `PhiForCausalLM` (Phi model)
- **phi3** -- `Phi3ForCausalLM` (Phi3 model)
- **phi4_multimodal** -- `Phi4MultimodalForCausalLM` (Phi4Multimodal model)
- **phimoe** -- `PhimoeForCausalLM` (Phimoe model)
- **plbart** -- `PLBartForCausalLM` (PLBart model)
- **prophetnet** -- `ProphetNetForCausalLM` (ProphetNet model)
- **qdqbert** -- `QDQBertLMHeadModel` (QDQBert model)
- **qwen2** -- `Qwen2ForCausalLM` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForCausalLM` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForCausalLM` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForCausalLM` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForCausalLM` (Qwen3Next model)
- **recurrent_gemma** -- `RecurrentGemmaForCausalLM` (RecurrentGemma model)
- **reformer** -- `ReformerModelWithLMHead` (Reformer model)
- **rembert** -- `RemBertForCausalLM` (RemBERT model)
- **roberta** -- [RobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForCausalLM) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForCausalLM` (RoCBert model)
- **roformer** -- `RoFormerForCausalLM` (RoFormer model)
- **rwkv** -- `RwkvForCausalLM` (RWKV model)
- **seed_oss** -- `SeedOssForCausalLM` (SeedOss model)
- **smollm3** -- `SmolLM3ForCausalLM` (SmolLM3 model)
- **speech_to_text_2** -- `Speech2Text2ForCausalLM` (Speech2Text2 model)
- **stablelm** -- `StableLmForCausalLM` (StableLm model)
- **starcoder2** -- `Starcoder2ForCausalLM` (Starcoder2 model)
- **transfo-xl** -- `TransfoXLLMHeadModel` (Transformer-XL model)
- **trocr** -- `TrOCRForCausalLM` (TrOCR model)
- **vaultgemma** -- `VaultGemmaForCausalLM` (VaultGemma model)
- **whisper** -- `WhisperForCausalLM` (Whisper model)
- **xglm** -- `XGLMForCausalLM` (XGLM model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-prophetnet** -- `XLMProphetNetForCausalLM` (XLM-ProphetNet model)
- **xlm-roberta** -- `XLMRobertaForCausalLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForCausalLM` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetLMHeadModel` (XLNet model)
- **xlstm** -- `xLSTMForCausalLM` (xLSTM model)
- **xmod** -- `XmodForCausalLM` (X-MOD model)
- **zamba** -- `ZambaForCausalLM` (Zamba model)
- **zamba2** -- `Zamba2ForCausalLM` (Zamba2 model)
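The selection rule above (the config's `model_type` wins; otherwise fall back to pattern matching on the name or path) can be sketched as a minimal dispatch. The mapping below is a hypothetical three-entry excerpt, not the real table, and the function is illustrative only:

```python
# Minimal sketch of AutoModelForCausalLM's selection rule: prefer the
# config-provided model_type, else substring-match the model name/path.
# Hypothetical mapping for illustration; the real one covers all entries above.
MODEL_MAPPING = {
    "llama": "LlamaForCausalLM",
    "gpt2": "GPT2LMHeadModel",
    "mistral": "MistralForCausalLM",
}

def resolve_model_class(name_or_path, model_type=None):
    if model_type is not None:  # the config's model_type takes priority
        return MODEL_MAPPING[model_type]
    for key, cls in MODEL_MAPPING.items():  # fallback: pattern match the path
        if key in name_or_path.lower():
            return cls
    raise ValueError(f"Could not infer model type from {name_or_path!r}")

print(resolve_model_class("meta-llama/Llama-2-7b-hf"))   # LlamaForCausalLM
print(resolve_model_class("./my_dir", model_type="gpt2"))  # GPT2LMHeadModel
```

The real implementation resolves the actual model classes lazily rather than strings, but the priority order is the same.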

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
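The kwargs behavior described above (keys matching configuration attributes override the config; the remainder is forwarded to the model's `__init__`) can be sketched roughly as follows, using a hypothetical dummy config object rather than a real `PretrainedConfig`:

```python
# Illustrative sketch of how extra kwargs are split between configuration
# overrides and model-__init__ arguments when no explicit config is passed.
# DummyConfig and split_kwargs are hypothetical; the real logic lives in
# PretrainedConfig.from_pretrained.
class DummyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):      # matches a config attribute: override it
            setattr(config, key, value)
        else:                         # otherwise: forward to the model __init__
            model_kwargs[key] = value
    return config, model_kwargs

cfg, extra = split_kwargs(DummyConfig(), output_attentions=True, custom_flag=1)
print(cfg.output_attentions)  # True
print(extra)                  # {'custom_flag': 1}
```

This is why passing an explicit `config` changes the behavior: with a config supplied, no such split happens and all kwargs go straight to the model.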

### TFAutoModelForCausalLM[[transformers.TFAutoModelForCausalLM]]

#### transformers.TFAutoModelForCausalLM[[transformers.TFAutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L569)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertLMHeadModel) (BERT model)
  - `CTRLConfig` configuration class: `TFCTRLLMHeadModel` (CTRL model)
  - `CamembertConfig` configuration class: `TFCamembertForCausalLM` (CamemBERT model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJForCausalLM` (GPT-J model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [TFMistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralForCausalLM) (Mistral model)
  - `OPTConfig` configuration class: `TFOPTForCausalLM` (OPT model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTLMHeadModel) (OpenAI GPT model)
  - `RemBertConfig` configuration class: `TFRemBertForCausalLM` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForCausalLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForCausalLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLLMHeadModel` (Transformer-XL model)
  - `XGLMConfig` configuration class: `TFXGLMForCausalLM` (XGLM model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForCausalLM` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetLMHeadModel` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)
```

#### from_pretrained[[transformers.TFAutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [TFBertLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertLMHeadModel) (BERT model)
- **camembert** -- `TFCamembertForCausalLM` (CamemBERT model)
- **ctrl** -- `TFCTRLLMHeadModel` (CTRL model)
- **gpt-sw3** -- [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (GPT-Sw3 model)
- **gpt2** -- [TFGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2LMHeadModel) (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJForCausalLM` (GPT-J model)
- **mistral** -- [TFMistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralForCausalLM) (Mistral model)
- **openai-gpt** -- [TFOpenAIGPTLMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTLMHeadModel) (OpenAI GPT model)
- **opt** -- `TFOPTForCausalLM` (OPT model)
- **rembert** -- `TFRemBertForCausalLM` (RemBERT model)
- **roberta** -- [TFRobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForCausalLM) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForCausalLM` (RoFormer model)
- **transfo-xl** -- `TFTransfoXLLMHeadModel` (Transformer-XL model)
- **xglm** -- `TFXGLMForCausalLM` (XGLM model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForCausalLM` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetLMHeadModel` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
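
The first step of the no-`config` path, in which keys of `kwargs` that match configuration attributes override them, can be sketched in isolation with a locally constructed configuration (a sketch only; `BertConfig` is chosen here because it needs no download, and the attribute values are arbitrary):

```python
from transformers import BertConfig

# Keyword arguments that correspond to configuration attributes override the
# defaults, mirroring what from_pretrained() does before building the model.
config = BertConfig(output_attentions=True, num_hidden_layers=2)
print(config.output_attentions, config.num_hidden_layers)  # True 2
```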

### FlaxAutoModelForCausalLM[[transformers.FlaxAutoModelForCausalLM]]

#### transformers.FlaxAutoModelForCausalLM[[transformers.FlaxAutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L295)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForCausalLM) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForCausalLM) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForCausalLM` (BigBird model)
  - `BloomConfig` configuration class: `FlaxBloomForCausalLM` (BLOOM model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForCausalLM) (ELECTRA model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [FlaxGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2LMHeadModel) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `FlaxGPTJForCausalLM` (GPT-J model)
  - `GPTNeoConfig` configuration class: `FlaxGPTNeoForCausalLM` (GPT Neo model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [FlaxGemmaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.FlaxGemmaForCausalLM) (Gemma model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `FlaxLlamaForCausalLM` (LLaMA model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [FlaxMistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.FlaxMistralForCausalLM) (Mistral model)
  - `OPTConfig` configuration class: `FlaxOPTForCausalLM` (OPT model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForCausalLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
  - `XGLMConfig` configuration class: `FlaxXGLMForCausalLM` (XGLM model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForCausalLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForCausalLM.from_config(config)
```

#### from_pretrained[[transformers.FlaxAutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [FlaxBartForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForCausalLM) (BART model)
- **bert** -- [FlaxBertForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForCausalLM) (BERT model)
- **big_bird** -- `FlaxBigBirdForCausalLM` (BigBird model)
- **bloom** -- `FlaxBloomForCausalLM` (BLOOM model)
- **electra** -- [FlaxElectraForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForCausalLM) (ELECTRA model)
- **gemma** -- [FlaxGemmaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.FlaxGemmaForCausalLM) (Gemma model)
- **gpt-sw3** -- [FlaxGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2LMHeadModel) (GPT-Sw3 model)
- **gpt2** -- [FlaxGPT2LMHeadModel](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.FlaxGPT2LMHeadModel) (OpenAI GPT-2 model)
- **gpt_neo** -- `FlaxGPTNeoForCausalLM` (GPT Neo model)
- **gptj** -- `FlaxGPTJForCausalLM` (GPT-J model)
- **llama** -- `FlaxLlamaForCausalLM` (LLaMA model)
- **mistral** -- [FlaxMistralForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.FlaxMistralForCausalLM) (Mistral model)
- **opt** -- `FlaxOPTForCausalLM` (OPT model)
- **roberta** -- [FlaxRobertaForCausalLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForCausalLM) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForCausalLM` (RoBERTa-PreLayerNorm model)
- **xglm** -- `FlaxXGLMForCausalLM` (XGLM model)
- **xlm-roberta** -- `FlaxXLMRobertaForCausalLM` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

#### transformers.AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1979)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForMaskedLM` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForMaskedLM) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForMaskedLM` (BigBird model)
  - `CamembertConfig` configuration class: `CamembertForMaskedLM` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBERT model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForMaskedLM` (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMaskedLM` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForMaskedLM) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForMaskedLM` (ERNIE model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForMaskedLM) (ESM model)
  - `FNetConfig` configuration class: `FNetForMaskedLM` (FNet model)
  - `FlaubertConfig` configuration class: `FlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForMaskedLM` (Funnel Transformer model)
  - `IBertConfig` configuration class: `IBertForMaskedLM` (I-BERT model)
  - `LayoutLMConfig` configuration class: `LayoutLMForMaskedLM` (LayoutLM model)
  - `LongformerConfig` configuration class: `LongformerForMaskedLM` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMaskedLM` (LUKE model)
  - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForMaskedLM` (MPNet model)
  - `MegaConfig` configuration class: `MegaForMaskedLM` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForMaskedLM` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForMaskedLM` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForMaskedLM` (ModernBERT model)
  - `MraConfig` configuration class: `MraForMaskedLM` (MRA model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NezhaConfig` configuration class: `NezhaForMaskedLM` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForMaskedLM` (Nyströmformer model)
  - `PerceiverConfig` configuration class: `PerceiverForMaskedLM` (Perceiver model)
  - `QDQBertConfig` configuration class: `QDQBertForMaskedLM` (QDQBert model)
  - `ReformerConfig` configuration class: `ReformerForMaskedLM` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForMaskedLM` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForMaskedLM` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForMaskedLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMaskedLM` (SqueezeBERT model)
  - `TapasConfig` configuration class: `TapasForMaskedLM` (TAPAS model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForMaskedLM` (Wav2Vec2 model)
  - `XLMConfig` configuration class: `XLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
  - `XmodConfig` configuration class: `XmodForMaskedLM` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForMaskedLM` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)
```

#### from_pretrained[[transformers.AutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForMaskedLM` (ALBERT model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bert** -- [BertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForMaskedLM) (BERT model)
- **big_bird** -- `BigBirdForMaskedLM` (BigBird model)
- **camembert** -- `CamembertForMaskedLM` (CamemBERT model)
- **convbert** -- [ConvBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBERT model)
- **data2vec-text** -- `Data2VecTextForMaskedLM` (Data2VecText model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMaskedLM` (DistilBERT model)
- **electra** -- [ElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForMaskedLM) (ELECTRA model)
- **ernie** -- `ErnieForMaskedLM` (ERNIE model)
- **esm** -- [EsmForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForMaskedLM) (ESM model)
- **flaubert** -- `FlaubertWithLMHeadModel` (FlauBERT model)
- **fnet** -- `FNetForMaskedLM` (FNet model)
- **funnel** -- `FunnelForMaskedLM` (Funnel Transformer model)
- **ibert** -- `IBertForMaskedLM` (I-BERT model)
- **layoutlm** -- `LayoutLMForMaskedLM` (LayoutLM model)
- **longformer** -- `LongformerForMaskedLM` (Longformer model)
- **luke** -- `LukeForMaskedLM` (LUKE model)
- **mbart** -- `MBartForConditionalGeneration` (mBART model)
- **mega** -- `MegaForMaskedLM` (MEGA model)
- **megatron-bert** -- `MegatronBertForMaskedLM` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForMaskedLM` (MobileBERT model)
- **modernbert** -- `ModernBertForMaskedLM` (ModernBERT model)
- **mpnet** -- `MPNetForMaskedLM` (MPNet model)
- **mra** -- `MraForMaskedLM` (MRA model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nezha** -- `NezhaForMaskedLM` (Nezha model)
- **nystromformer** -- `NystromformerForMaskedLM` (Nyströmformer model)
- **perceiver** -- `PerceiverForMaskedLM` (Perceiver model)
- **qdqbert** -- `QDQBertForMaskedLM` (QDQBert model)
- **reformer** -- `ReformerForMaskedLM` (Reformer model)
- **rembert** -- `RemBertForMaskedLM` (RemBERT model)
- **roberta** -- [RobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForMaskedLM` (RoCBert model)
- **roformer** -- `RoFormerForMaskedLM` (RoFormer model)
- **squeezebert** -- `SqueezeBertForMaskedLM` (SqueezeBERT model)
- **tapas** -- `TapasForMaskedLM` (TAPAS model)
- **wav2vec2** -- `Wav2Vec2ForMaskedLM` (Wav2Vec2 model)
- **xlm** -- `XLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMaskedLM` (XLM-RoBERTa-XL model)
- **xmod** -- `XmodForMaskedLM` (X-MOD model)
- **yoso** -- `YosoForMaskedLM` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
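Since `from_pretrained()` returns the model in evaluation mode, a fine-tuning loop needs an explicit switch back. A small sketch of the toggle, built from an untrained config so no checkpoint is fetched (the tiny sizes are arbitrary):

```python
from transformers import BertConfig, AutoModelForMaskedLM

config = BertConfig(hidden_size=64, num_hidden_layers=2, num_attention_heads=2, intermediate_size=128)
model = AutoModelForMaskedLM.from_config(config)

model.eval()            # mirrors the from_pretrained() default: dropout disabled
print(model.training)   # False

model.train()           # switch back before fine-tuning
print(model.training)   # True
```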

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForMaskedLM[[transformers.TFAutoModelForMaskedLM]]

#### transformers.TFAutoModelForMaskedLM[[transformers.TFAutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L619)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForMaskedLM` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForMaskedLM) (BERT model)
  - `CamembertConfig` configuration class: `TFCamembertForMaskedLM` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForMaskedLM) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForMaskedLM) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForMaskedLM) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMaskedLM` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForMaskedLM) (ELECTRA model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForMaskedLM) (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertWithLMHeadModel` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForMaskedLM` (Funnel Transformer model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForMaskedLM` (LayoutLM model)
  - `LongformerConfig` configuration class: `TFLongformerForMaskedLM` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForMaskedLM` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForMaskedLM` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForMaskedLM` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForMaskedLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `TapasConfig` configuration class: `TFTapasForMaskedLM` (TAPAS model)
  - `XLMConfig` configuration class: `TFXLMWithLMHeadModel` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedLM.from_config(config)
```

#### from_pretrained[[transformers.TFAutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForMaskedLM` (ALBERT model)
- **bert** -- [TFBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForMaskedLM) (BERT model)
- **camembert** -- `TFCamembertForMaskedLM` (CamemBERT model)
- **convbert** -- [TFConvBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForMaskedLM) (ConvBERT model)
- **deberta** -- [TFDebertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForMaskedLM) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForMaskedLM) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- [TFElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForMaskedLM) (ELECTRA model)
- **esm** -- [TFEsmForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForMaskedLM) (ESM model)
- **flaubert** -- `TFFlaubertWithLMHeadModel` (FlauBERT model)
- **funnel** -- `TFFunnelForMaskedLM` (Funnel Transformer model)
- **layoutlm** -- `TFLayoutLMForMaskedLM` (LayoutLM model)
- **longformer** -- `TFLongformerForMaskedLM` (Longformer model)
- **mobilebert** -- `TFMobileBertForMaskedLM` (MobileBERT model)
- **mpnet** -- `TFMPNetForMaskedLM` (MPNet model)
- **rembert** -- `TFRemBertForMaskedLM` (RemBERT model)
- **roberta** -- [TFRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForMaskedLM` (RoFormer model)
- **tapas** -- `TFTapasForMaskedLM` (TAPAS model)
- **xlm** -- `TFXLMWithLMHeadModel` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMaskedLM` (XLM-RoBERTa model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
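The two-way `kwargs` behavior described above can be sketched in plain Python. This is a hypothetical illustration, not the actual Transformers implementation: keys that match existing configuration attributes override them, and the remaining keys are collected for the model's `__init__`.

```python
# Hypothetical sketch of how `kwargs` are split when no explicit `config`
# is passed: keys matching configuration attributes update the config;
# leftover keys are forwarded to the model's __init__.
class DummyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # override a config attribute
        else:
            model_kwargs[key] = value     # pass through to the model __init__
    return config, model_kwargs

config, model_kwargs = split_kwargs(DummyConfig(), output_attentions=True, custom_flag=1)
print(config.output_attentions)  # True
print(model_kwargs)              # {'custom_flag': 1}
```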

### FlaxAutoModelForMaskedLM[[transformers.FlaxAutoModelForMaskedLM]]

#### transformers.FlaxAutoModelForMaskedLM[[transformers.FlaxAutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L302)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForMaskedLM` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForMaskedLM` (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForMaskedLM` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMaskedLM) (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMaskedLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForMaskedLM` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForMaskedLM` (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForMaskedLM` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMaskedLM) (ELECTRA model) - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model) - `RoFormerConfig` configuration class: `FlaxRoFormerForMaskedLM` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForMaskedLM` (ALBERT model)
- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **bert** -- [FlaxBertForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMaskedLM) (BERT model)
- **big_bird** -- `FlaxBigBirdForMaskedLM` (BigBird model)
- **distilbert** -- `FlaxDistilBertForMaskedLM` (DistilBERT model)
- **electra** -- [FlaxElectraForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMaskedLM) (ELECTRA model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **roberta** -- [FlaxRobertaForMaskedLM](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMaskedLM) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMaskedLM` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMaskedLM` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForMaskedLM` (XLM-RoBERTa model)
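The selection logic described above — dispatch on the config's `model_type` when available, otherwise fall back to pattern matching on the name or path — can be sketched as follows. This is a hypothetical illustration, not the actual Transformers implementation:

```python
# Hypothetical sketch of the model-class selection: use `model_type` from the
# config if present; otherwise pattern-match on the pretrained name or path.
MODEL_MAPPING = {
    "bert": "FlaxBertForMaskedLM",
    "roberta": "FlaxRobertaForMaskedLM",
    "xlm-roberta": "FlaxXLMRobertaForMaskedLM",
}

def resolve_model_class(name_or_path, model_type=None):
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # Fallback: longest matching key wins, so "xlm-roberta" beats "roberta".
    matches = [key for key in MODEL_MAPPING if key in name_or_path]
    if not matches:
        raise ValueError(f"Could not infer model type from {name_or_path!r}")
    return MODEL_MAPPING[max(matches, key=len)]

print(resolve_model_class("google-bert/bert-base-cased"))  # FlaxBertForMaskedLM
print(resolve_model_class("xlm-roberta-base"))             # FlaxXLMRobertaForMaskedLM
```

Preferring the longest match matters because checkpoint names like `xlm-roberta-base` also contain the shorter keys `roberta` and `bert`.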

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

#### transformers.AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1920)

### TFAutoModelForMaskGeneration[[transformers.TFAutoModelForMaskGeneration]]

#### transformers.TFAutoModelForMaskGeneration[[transformers.TFAutoModelForMaskGeneration]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L530)

### AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

#### transformers.AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1986)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForConditionalGeneration` (BigBird-Pegasus model)
  - `BlenderbotConfig` configuration class: `BlenderbotForConditionalGeneration` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
  - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [EncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderModel) (Encoder decoder model)
  - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
  - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
  - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
  - `LEDConfig` configuration class: `LEDForConditionalGeneration` (LED model)
  - `LongT5Config` configuration class: `LongT5ForConditionalGeneration` (LongT5 model)
  - `M2M100Config` configuration class: `M2M100ForConditionalGeneration` (M2M100 model)
  - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `MT5ForConditionalGeneration` (MT5 model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianMTModel) (Marian model)
  - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model)
  - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model)
  - `PLBartConfig` configuration class: `PLBartForConditionalGeneration` (PLBart model)
  - `PegasusConfig` configuration class: `PegasusForConditionalGeneration` (Pegasus model)
  - `PegasusXConfig` configuration class: `PegasusXForConditionalGeneration` (PEGASUS-X model)
  - `ProphetNetConfig` configuration class: `ProphetNetForConditionalGeneration` (ProphetNet model)
  - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TForTextToText` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model)
  - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
  - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model)
  - `UMT5Config` configuration class: `UMT5ForConditionalGeneration` (UMT5 model)
  - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model)
  - `XLMProphetNetConfig` configuration class: `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.
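The `from_config` behavior above — lookup by the *type* of the config object, with no weights loaded — can be sketched in plain Python. This is a hypothetical illustration with toy stand-in classes, not the actual Transformers implementation:

```python
# Hypothetical sketch of from_config-style dispatch: the model class is
# looked up by the type of the config object, and the model is built with
# freshly initialized (not pretrained) weights.
class ToyT5Config: ...
class ToyBartConfig: ...

class ToyT5ForConditionalGeneration:
    def __init__(self, config):
        self.config = config  # weights would be randomly initialized here

class ToyBartForConditionalGeneration:
    def __init__(self, config):
        self.config = config

CONFIG_TO_MODEL = {
    ToyT5Config: ToyT5ForConditionalGeneration,
    ToyBartConfig: ToyBartForConditionalGeneration,
}

def from_config(config):
    model_class = CONFIG_TO_MODEL[type(config)]
    return model_class(config)  # no checkpoint download, no weight loading

model = from_config(ToyT5Config())
print(type(model).__name__)  # ToyT5ForConditionalGeneration
```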

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model) - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForConditionalGeneration` (BigBird-Pegasus model) - `BlenderbotConfig` configuration class: `BlenderbotForConditionalGeneration` (Blenderbot model) - `BlenderbotSmallConfig` configuration class: `BlenderbotSmallForConditionalGeneration` (BlenderbotSmall model) - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [EncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderModel) (Encoder decoder model) - `FSMTConfig` configuration class: `FSMTForConditionalGeneration` (FairSeq Machine-Translation model) - `GPTSanJapaneseConfig` configuration class: `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model) - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model) - `LEDConfig` configuration class: `LEDForConditionalGeneration` (LED model) - `LongT5Config` configuration class: `LongT5ForConditionalGeneration` (LongT5 model) - `M2M100Config` configuration class: `M2M100ForConditionalGeneration` (M2M100 model) - `MBartConfig` configuration class: `MBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `MT5ForConditionalGeneration` (MT5 model) - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [MarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianMTModel) (Marian model) - `MvpConfig` configuration class: `MvpForConditionalGeneration` (MVP model) - `NllbMoeConfig` configuration class: `NllbMoeForConditionalGeneration` (NLLB-MOE model) - `PLBartConfig` configuration class: `PLBartForConditionalGeneration` (PLBart model) - `PegasusConfig` configuration class: `PegasusForConditionalGeneration` (Pegasus model) - `PegasusXConfig` configuration class: `PegasusXForConditionalGeneration` (PEGASUS-X model) - `ProphetNetConfig` configuration class: `ProphetNetForConditionalGeneration` (ProphetNet model) - `Qwen2AudioConfig` configuration class: `Qwen2AudioForConditionalGeneration` (Qwen2Audio model) - `SeamlessM4TConfig` configuration class: `SeamlessM4TForTextToText` (SeamlessM4T model) - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model) - `SwitchTransformersConfig` configuration class: `SwitchTransformersForConditionalGeneration` (SwitchTransformers model) - `T5Config` configuration class: `T5ForConditionalGeneration` (T5 model) - `T5GemmaConfig` configuration class: `T5GemmaForConditionalGeneration` (T5Gemma model) - `UMT5Config` configuration class: `UMT5ForConditionalGeneration` (UMT5 model) - `VoxtralConfig` configuration class: `VoxtralForConditionalGeneration` (Voxtral model) - `XLMProphetNetConfig` configuration class: `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [BartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForConditionalGeneration) (BART model)
- **bigbird_pegasus** -- `BigBirdPegasusForConditionalGeneration` (BigBird-Pegasus model)
- **blenderbot** -- `BlenderbotForConditionalGeneration` (Blenderbot model)
- **blenderbot-small** -- `BlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
- **encoder-decoder** -- [EncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderModel) (Encoder decoder model)
- **fsmt** -- `FSMTForConditionalGeneration` (FairSeq Machine-Translation model)
- **gptsan-japanese** -- `GPTSanJapaneseForConditionalGeneration` (GPTSAN-japanese model)
- **granite_speech** -- `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
- **led** -- `LEDForConditionalGeneration` (LED model)
- **longt5** -- `LongT5ForConditionalGeneration` (LongT5 model)
- **m2m_100** -- `M2M100ForConditionalGeneration` (M2M100 model)
- **marian** -- [MarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianMTModel) (Marian model)
- **mbart** -- `MBartForConditionalGeneration` (mBART model)
- **mt5** -- `MT5ForConditionalGeneration` (MT5 model)
- **mvp** -- `MvpForConditionalGeneration` (MVP model)
- **nllb-moe** -- `NllbMoeForConditionalGeneration` (NLLB-MOE model)
- **pegasus** -- `PegasusForConditionalGeneration` (Pegasus model)
- **pegasus_x** -- `PegasusXForConditionalGeneration` (PEGASUS-X model)
- **plbart** -- `PLBartForConditionalGeneration` (PLBart model)
- **prophetnet** -- `ProphetNetForConditionalGeneration` (ProphetNet model)
- **qwen2_audio** -- `Qwen2AudioForConditionalGeneration` (Qwen2Audio model)
- **seamless_m4t** -- `SeamlessM4TForTextToText` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2ForTextToText` (SeamlessM4Tv2 model)
- **switch_transformers** -- `SwitchTransformersForConditionalGeneration` (SwitchTransformers model)
- **t5** -- `T5ForConditionalGeneration` (T5 model)
- **t5gemma** -- `T5GemmaForConditionalGeneration` (T5Gemma model)
- **umt5** -- `UMT5ForConditionalGeneration` (UMT5 model)
- **voxtral** -- `VoxtralForConditionalGeneration` (Voxtral model)
- **xlm-prophetnet** -- `XLMProphetNetForConditionalGeneration` (XLM-ProphetNet model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
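The eval/train toggle can be sketched offline with a tiny, illustrative configuration (the hyperparameters below are arbitrary and do not correspond to any real checkpoint):

```python
from transformers import AutoModelForSeq2SeqLM, T5Config

# Tiny hypothetical T5 hyperparameters, chosen only so the example builds quickly.
config = T5Config(vocab_size=128, d_model=16, d_kv=4, d_ff=32, num_layers=1, num_heads=2)
model = AutoModelForSeq2SeqLM.from_config(config)

model.eval()  # dropout and similar modules are deactivated
assert not model.training

model.train()  # switch back to training mode before fine-tuning
assert model.training
```

Note that `from_pretrained()` already calls `model.eval()` for you; the explicit toggle above only matters when you want to fine-tune.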

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
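The second branch of the `kwargs` behavior (no `config` passed, so keyword arguments update the auto-loaded configuration) can be demonstrated offline by saving a tiny model and reloading it. The hyperparameters below are arbitrary, illustrative values, not a real checkpoint:

```python
import tempfile

from transformers import AutoModelForSeq2SeqLM, T5Config

# Build a tiny model from an illustrative config so the example runs quickly.
config = T5Config(vocab_size=128, d_model=16, d_kv=4, d_ff=32, num_layers=1, num_heads=2)
model = AutoModelForSeq2SeqLM.from_config(config)

with tempfile.TemporaryDirectory() as tmpdir:
    model.save_pretrained(tmpdir)
    # No `config` argument here, so `output_attentions=True` is applied to the
    # configuration loaded from the saved directory before the model is built.
    reloaded = AutoModelForSeq2SeqLM.from_pretrained(tmpdir, output_attentions=True)

print(reloaded.config.output_attentions)  # True
```

Had a `config` object been passed alongside `output_attentions=True`, the kwarg would instead have gone straight to the model's `__init__`.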

### TFAutoModelForSeq2SeqLM[[transformers.TFAutoModelForSeq2SeqLM]]

#### transformers.TFAutoModelForSeq2SeqLM[[transformers.TFAutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L626)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
  - `BlenderbotConfig` configuration class: `TFBlenderbotForConditionalGeneration` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `TFBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
  - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [TFEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel) (Encoder decoder model)
  - `LEDConfig` configuration class: `TFLEDForConditionalGeneration` (LED model)
  - `MBartConfig` configuration class: `TFMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `TFMT5ForConditionalGeneration` (MT5 model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [TFMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianMTModel) (Marian model)
  - `PegasusConfig` configuration class: `TFPegasusForConditionalGeneration` (Pegasus model)
  - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model) - `BlenderbotConfig` configuration class: `TFBlenderbotForConditionalGeneration` (Blenderbot model) - `BlenderbotSmallConfig` configuration class: `TFBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model) - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [TFEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel) (Encoder decoder model) - `LEDConfig` configuration class: `TFLEDForConditionalGeneration` (LED model) - `MBartConfig` configuration class: `TFMBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `TFMT5ForConditionalGeneration` (MT5 model) - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [TFMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianMTModel) (Marian model) - `PegasusConfig` configuration class: `TFPegasusForConditionalGeneration` (Pegasus model) - `T5Config` configuration class: `TFT5ForConditionalGeneration` (T5 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [TFBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForConditionalGeneration) (BART model)
- **blenderbot** -- `TFBlenderbotForConditionalGeneration` (Blenderbot model)
- **blenderbot-small** -- `TFBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
- **encoder-decoder** -- [TFEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel) (Encoder decoder model)
- **led** -- `TFLEDForConditionalGeneration` (LED model)
- **marian** -- [TFMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.TFMarianMTModel) (Marian model)
- **mbart** -- `TFMBartForConditionalGeneration` (mBART model)
- **mt5** -- `TFMT5ForConditionalGeneration` (MT5 model)
- **pegasus** -- `TFPegasusForConditionalGeneration` (Pegasus model)
- **t5** -- `TFT5ForConditionalGeneration` (T5 model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForSeq2SeqLM[[transformers.FlaxAutoModelForSeq2SeqLM]]

#### transformers.FlaxAutoModelForSeq2SeqLM[[transformers.FlaxAutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L309)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
  - `BlenderbotConfig` configuration class: `FlaxBlenderbotForConditionalGeneration` (Blenderbot model)
  - `BlenderbotSmallConfig` configuration class: `FlaxBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
  - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [FlaxEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.FlaxEncoderDecoderModel) (Encoder decoder model)
  - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model)
  - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model)
  - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model)
  - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [FlaxMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianMTModel) (Marian model)
  - `PegasusConfig` configuration class: `FlaxPegasusForConditionalGeneration` (Pegasus model)
  - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = FlaxAutoModelForSeq2SeqLM.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model) - `BlenderbotConfig` configuration class: `FlaxBlenderbotForConditionalGeneration` (Blenderbot model) - `BlenderbotSmallConfig` configuration class: `FlaxBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model) - [EncoderDecoderConfig](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [FlaxEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.FlaxEncoderDecoderModel) (Encoder decoder model) - `LongT5Config` configuration class: `FlaxLongT5ForConditionalGeneration` (LongT5 model) - `MBartConfig` configuration class: `FlaxMBartForConditionalGeneration` (mBART model) - `MT5Config` configuration class: `FlaxMT5ForConditionalGeneration` (MT5 model) - [MarianConfig](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.MarianConfig) configuration class: [FlaxMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianMTModel) (Marian model) - `PegasusConfig` configuration class: `FlaxPegasusForConditionalGeneration` (Pegasus model) - `T5Config` configuration class: `FlaxT5ForConditionalGeneration` (T5 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bart** -- [FlaxBartForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForConditionalGeneration) (BART model)
- **blenderbot** -- `FlaxBlenderbotForConditionalGeneration` (Blenderbot model)
- **blenderbot-small** -- `FlaxBlenderbotSmallForConditionalGeneration` (BlenderbotSmall model)
- **encoder-decoder** -- [FlaxEncoderDecoderModel](/docs/transformers/v4.57.1/ko/model_doc/encoder-decoder#transformers.FlaxEncoderDecoderModel) (Encoder decoder model)
- **longt5** -- `FlaxLongT5ForConditionalGeneration` (LongT5 model)
- **marian** -- [FlaxMarianMTModel](/docs/transformers/v4.57.1/ko/model_doc/marian#transformers.FlaxMarianMTModel) (Marian model)
- **mbart** -- `FlaxMBartForConditionalGeneration` (mBART model)
- **mt5** -- `FlaxMT5ForConditionalGeneration` (MT5 model)
- **pegasus** -- `FlaxPegasusForConditionalGeneration` (Pegasus model)
- **t5** -- `FlaxT5ForConditionalGeneration` (T5 model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

#### transformers.AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1997)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForSequenceClassification` (ALBERT model)
  - `ArceeConfig` configuration class: `ArceeForSequenceClassification` (Arcee model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForSequenceClassification) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForSequenceClassification` (BigBird model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForSequenceClassification` (BigBird-Pegasus model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGpt model)
  - `BloomConfig` configuration class: `BloomForSequenceClassification` (BLOOM model)
  - `CTRLConfig` configuration class: `CTRLForSequenceClassification` (CTRL model)
  - `CamembertConfig` configuration class: `CamembertForSequenceClassification` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineForSequenceClassification` (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBERT model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForSequenceClassification` (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForSequenceClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DeBERTa-v2 model)
  - `DeepseekV2Config` configuration class: `DeepseekV2ForSequenceClassification` (DeepSeek-V2 model)
  - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: `DeepseekV3ForSequenceClassification` (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForSequenceClassification` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForSequenceClassification` (DistilBERT model)
  - `DogeConfig` configuration class: `DogeForSequenceClassification` (Doge model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForSequenceClassification) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForSequenceClassification` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForSequenceClassification` (ErnieM model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForSequenceClassification) (ESM model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForSequenceClassification) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForSequenceClassification` (FNet model)
  - `FalconConfig` configuration class: `FalconForSequenceClassification` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForSequenceClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForSequenceClassification` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForSequenceClassification` (GPTBigCode model)
  - `GPTJConfig` configuration class: `GPTJForSequenceClassification` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForSequenceClassification` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForSequenceClassification` (GPT NeoX model)
  - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForSequenceClassification) (Gemma2 model)
  - [Gemma3Config](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForSequenceClassification) (Gemma3ForConditionalGeneration model)
  - [Gemma3TextConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: `Gemma3TextForSequenceClassification` (Gemma3ForCausalLM model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForSequenceClassification) (Gemma model)
  - `Glm4Config` configuration class: `Glm4ForSequenceClassification` (GLM4 model)
  - `GlmConfig` configuration class: `GlmForSequenceClassification` (GLM model)
  - `GptOssConfig` configuration class: `GptOssForSequenceClassification` (GptOss model)
  - `HeliumConfig` configuration class: `HeliumForSequenceClassification` (Helium model)
  - `HunYuanDenseV1Config` configuration class: `HunYuanDenseV1ForSequenceClassification` (HunYuanDenseV1 model)
  - `HunYuanMoEV1Config` configuration class: `HunYuanMoEV1ForSequenceClassification` (HunYuanMoeV1 model)
  - `IBertConfig` configuration class: `IBertForSequenceClassification` (I-BERT model)
  - [JambaConfig](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaForSequenceClassification) (Jamba model)
  - `JetMoeConfig` configuration class: `JetMoeForSequenceClassification` (JetMoe model)
  - `LEDConfig` configuration class: `LEDForSequenceClassification` (LED model)
  - `LayoutLMConfig` configuration class: `LayoutLMForSequenceClassification` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForSequenceClassification` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForSequenceClassification` (LiLT model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForSequenceClassification) (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForSequenceClassification` (Longformer model)
  - `LukeConfig` configuration class: `LukeForSequenceClassification` (LUKE model)
  - `MBartConfig` configuration class: `MBartForSequenceClassification` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForSequenceClassification` (MPNet model)
  - `MT5Config` configuration class: `MT5ForSequenceClassification` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForSequenceClassification` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForSequenceClassification` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForSequenceClassification` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForSequenceClassification` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForSequenceClassification` (Ministral model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForSequenceClassification) (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForSequenceClassification` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForSequenceClassification` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForSequenceClassification` (ModernBERT model)
  - `ModernBertDecoderConfig` configuration class: `ModernBertDecoderForSequenceClassification` (ModernBertDecoder model)
  - `MptConfig` configuration class: `MptForSequenceClassification` (MPT model)
  - `MraConfig` configuration class: `MraForSequenceClassification` (MRA model)
  - `MvpConfig` configuration class: `MvpForSequenceClassification` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForSequenceClassification` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForSequenceClassification` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForSequenceClassification` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTForSequenceClassification` (OPT model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTForSequenceClassification) (OpenAI GPT model)
  - `OpenLlamaConfig` configuration class: `OpenLlamaForSequenceClassification` (OpenLlama model)
  - `PLBartConfig` configuration class: `PLBartForSequenceClassification` (PLBart model)
  - `PerceiverConfig` configuration class: `PerceiverForSequenceClassification` (Perceiver model)
  - `PersimmonConfig` configuration class: `PersimmonForSequenceClassification` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForSequenceClassification` (Phi3 model)
  - `PhiConfig` configuration class: `PhiForSequenceClassification` (Phi model)
  - `PhimoeConfig` configuration class: `PhimoeForSequenceClassification` (Phimoe model)
  - `QDQBertConfig` configuration class: `QDQBertForSequenceClassification` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForSequenceClassification` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForSequenceClassification` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForSequenceClassification` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForSequenceClassification` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForSequenceClassification` (Qwen3Next model)
  - `ReformerConfig` configuration class: `ReformerForSequenceClassification` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForSequenceClassification` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForSequenceClassification` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForSequenceClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForSequenceClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForSequenceClassification` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForSequenceClassification` (SmolLM3 model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForSequenceClassification` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmForSequenceClassification` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForSequenceClassification` (Starcoder2 model)
  - `T5Config` configuration class: `T5ForSequenceClassification` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForSequenceClassification` (T5Gemma model)
  - `TapasConfig` configuration class: `TapasForSequenceClassification` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TransfoXLForSequenceClassification` (Transformer-XL model)
  - `UMT5Config` configuration class: `UMT5ForSequenceClassification` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForSequenceClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForSequenceClassification` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForSequenceClassification` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForSequenceClassification` (XLNet model)
  - `XmodConfig` configuration class: `XmodForSequenceClassification` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForSequenceClassification` (YOSO model)
  - `Zamba2Config` configuration class: `Zamba2ForSequenceClassification` (Zamba2 model)
  - `ZambaConfig` configuration class: `ZambaForSequenceClassification` (Zamba model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)
```
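Because no pretrained weights are loaded, `from_config` also works fully offline with a locally constructed configuration. In this sketch, the tiny BERT configuration is purely illustrative; note how `num_labels` on the config sizes the randomly initialized classification head:

```python
from transformers import AutoModelForSequenceClassification, BertConfig

# A small locally built configuration — from_config needs no download and
# initializes random weights (no pretrained weights are loaded).
config = BertConfig(
    hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
    intermediate_size=64, num_labels=3,
)
model = AutoModelForSequenceClassification.from_config(config)

assert model.config.num_labels == 3
assert model.classifier.out_features == 3  # classification head sized from the config
```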

#### from_pretrained[[transformers.AutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForSequenceClassification` (ALBERT model)
- **arcee** -- `ArceeForSequenceClassification` (Arcee model)
- **bart** -- [BartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForSequenceClassification) (BART model)
- **bert** -- [BertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForSequenceClassification) (BERT model)
- **big_bird** -- `BigBirdForSequenceClassification` (BigBird model)
- **bigbird_pegasus** -- `BigBirdPegasusForSequenceClassification` (BigBird-Pegasus model)
- **biogpt** -- [BioGptForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGpt model)
- **bloom** -- `BloomForSequenceClassification` (BLOOM model)
- **camembert** -- `CamembertForSequenceClassification` (CamemBERT model)
- **canine** -- `CanineForSequenceClassification` (CANINE model)
- **code_llama** -- [LlamaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForSequenceClassification) (CodeLlama model)
- **convbert** -- [ConvBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBERT model)
- **ctrl** -- `CTRLForSequenceClassification` (CTRL model)
- **data2vec-text** -- `Data2VecTextForSequenceClassification` (Data2VecText model)
- **deberta** -- [DebertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForSequenceClassification) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DeBERTa-v2 model)
- **deepseek_v2** -- `DeepseekV2ForSequenceClassification` (DeepSeek-V2 model)
- **deepseek_v3** -- `DeepseekV3ForSequenceClassification` (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForSequenceClassification` (DiffLlama model)
- **distilbert** -- `DistilBertForSequenceClassification` (DistilBERT model)
- **doge** -- `DogeForSequenceClassification` (Doge model)
- **electra** -- [ElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForSequenceClassification) (ELECTRA model)
- **ernie** -- `ErnieForSequenceClassification` (ERNIE model)
- **ernie_m** -- `ErnieMForSequenceClassification` (ErnieM model)
- **esm** -- [EsmForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForSequenceClassification) (ESM model)
- **exaone4** -- [Exaone4ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForSequenceClassification) (EXAONE-4.0 model)
- **falcon** -- `FalconForSequenceClassification` (Falcon model)
- **flaubert** -- `FlaubertForSequenceClassification` (FlauBERT model)
- **fnet** -- `FNetForSequenceClassification` (FNet model)
- **funnel** -- `FunnelForSequenceClassification` (Funnel Transformer model)
- **gemma** -- [GemmaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForSequenceClassification) (Gemma model)
- **gemma2** -- [Gemma2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForSequenceClassification) (Gemma2 model)
- **gemma3** -- [Gemma3ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma3#transformers.Gemma3ForSequenceClassification) (Gemma3ForConditionalGeneration model)
- **gemma3_text** -- `Gemma3TextForSequenceClassification` (Gemma3ForCausalLM model)
- **glm** -- `GlmForSequenceClassification` (GLM model)
- **glm4** -- `Glm4ForSequenceClassification` (GLM4 model)
- **gpt-sw3** -- [GPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (GPT-Sw3 model)
- **gpt2** -- [GPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForSequenceClassification` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForSequenceClassification` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForSequenceClassification` (GPT NeoX model)
- **gpt_oss** -- `GptOssForSequenceClassification` (GptOss model)
- **gptj** -- `GPTJForSequenceClassification` (GPT-J model)
- **helium** -- `HeliumForSequenceClassification` (Helium model)
- **hunyuan_v1_dense** -- `HunYuanDenseV1ForSequenceClassification` (HunYuanDenseV1 model)
- **hunyuan_v1_moe** -- `HunYuanMoEV1ForSequenceClassification` (HunYuanMoeV1 model)
- **ibert** -- `IBertForSequenceClassification` (I-BERT model)
- **jamba** -- [JambaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/jamba#transformers.JambaForSequenceClassification) (Jamba model)
- **jetmoe** -- `JetMoeForSequenceClassification` (JetMoe model)
- **layoutlm** -- `LayoutLMForSequenceClassification` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForSequenceClassification` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
- **led** -- `LEDForSequenceClassification` (LED model)
- **lilt** -- `LiltForSequenceClassification` (LiLT model)
- **llama** -- [LlamaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaForSequenceClassification) (LLaMA model)
- **longformer** -- `LongformerForSequenceClassification` (Longformer model)
- **luke** -- `LukeForSequenceClassification` (LUKE model)
- **markuplm** -- `MarkupLMForSequenceClassification` (MarkupLM model)
- **mbart** -- `MBartForSequenceClassification` (mBART model)
- **mega** -- `MegaForSequenceClassification` (MEGA model)
- **megatron-bert** -- `MegatronBertForSequenceClassification` (Megatron-BERT model)
- **minimax** -- `MiniMaxForSequenceClassification` (MiniMax model)
- **ministral** -- `MinistralForSequenceClassification` (Ministral model)
- **mistral** -- [MistralForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForSequenceClassification) (Mistral model)
- **mixtral** -- `MixtralForSequenceClassification` (Mixtral model)
- **mobilebert** -- `MobileBertForSequenceClassification` (MobileBERT model)
- **modernbert** -- `ModernBertForSequenceClassification` (ModernBERT model)
- **modernbert-decoder** -- `ModernBertDecoderForSequenceClassification` (ModernBertDecoder model)
- **mpnet** -- `MPNetForSequenceClassification` (MPNet model)
- **mpt** -- `MptForSequenceClassification` (MPT model)
- **mra** -- `MraForSequenceClassification` (MRA model)
- **mt5** -- `MT5ForSequenceClassification` (MT5 model)
- **mvp** -- `MvpForSequenceClassification` (MVP model)
- **nemotron** -- `NemotronForSequenceClassification` (Nemotron model)
- **nezha** -- `NezhaForSequenceClassification` (Nezha model)
- **nystromformer** -- `NystromformerForSequenceClassification` (Nyströmformer model)
- **open-llama** -- `OpenLlamaForSequenceClassification` (OpenLlama model)
- **openai-gpt** -- [OpenAIGPTForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTForSequenceClassification) (OpenAI GPT model)
- **opt** -- `OPTForSequenceClassification` (OPT model)
- **perceiver** -- `PerceiverForSequenceClassification` (Perceiver model)
- **persimmon** -- `PersimmonForSequenceClassification` (Persimmon model)
- **phi** -- `PhiForSequenceClassification` (Phi model)
- **phi3** -- `Phi3ForSequenceClassification` (Phi3 model)
- **phimoe** -- `PhimoeForSequenceClassification` (Phimoe model)
- **plbart** -- `PLBartForSequenceClassification` (PLBart model)
- **qdqbert** -- `QDQBertForSequenceClassification` (QDQBert model)
- **qwen2** -- `Qwen2ForSequenceClassification` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForSequenceClassification` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForSequenceClassification` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForSequenceClassification` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForSequenceClassification` (Qwen3Next model)
- **reformer** -- `ReformerForSequenceClassification` (Reformer model)
- **rembert** -- `RemBertForSequenceClassification` (RemBERT model)
- **roberta** -- [RobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForSequenceClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForSequenceClassification` (RoCBert model)
- **roformer** -- `RoFormerForSequenceClassification` (RoFormer model)
- **seed_oss** -- `SeedOssForSequenceClassification` (SeedOss model)
- **smollm3** -- `SmolLM3ForSequenceClassification` (SmolLM3 model)
- **squeezebert** -- `SqueezeBertForSequenceClassification` (SqueezeBERT model)
- **stablelm** -- `StableLmForSequenceClassification` (StableLm model)
- **starcoder2** -- `Starcoder2ForSequenceClassification` (Starcoder2 model)
- **t5** -- `T5ForSequenceClassification` (T5 model)
- **t5gemma** -- `T5GemmaForSequenceClassification` (T5Gemma model)
- **tapas** -- `TapasForSequenceClassification` (TAPAS model)
- **transfo-xl** -- `TransfoXLForSequenceClassification` (Transformer-XL model)
- **umt5** -- `UMT5ForSequenceClassification` (UMT5 model)
- **xlm** -- `XLMForSequenceClassification` (XLM model)
- **xlm-roberta** -- `XLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForSequenceClassification` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForSequenceClassification` (XLNet model)
- **xmod** -- `XmodForSequenceClassification` (X-MOD model)
- **yoso** -- `YosoForSequenceClassification` (YOSO model)
- **zamba** -- `ZambaForSequenceClassification` (Zamba model)
- **zamba2** -- `Zamba2ForSequenceClassification` (Zamba2 model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSequenceClassification[[transformers.TFAutoModelForSequenceClassification]]

#### transformers.TFAutoModelForSequenceClassification[[transformers.TFAutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L637)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForSequenceClassification` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model)
  - `CTRLConfig` configuration class: `TFCTRLForSequenceClassification` (CTRL model)
  - `CamembertConfig` configuration class: `TFCamembertForSequenceClassification` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForSequenceClassification` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForSequenceClassification) (ELECTRA model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForSequenceClassification) (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertForSequenceClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForSequenceClassification` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2ForSequenceClassification) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `TFGPTJForSequenceClassification` (GPT-J model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForSequenceClassification` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForSequenceClassification` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForSequenceClassification` (MPNet model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [TFMistralForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralForSequenceClassification) (Mistral model)
  - `MobileBertConfig` configuration class: `TFMobileBertForSequenceClassification` (MobileBERT model)
  - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [TFOpenAIGPTForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTForSequenceClassification) (OpenAI GPT model)
  - `RemBertConfig` configuration class: `TFRemBertForSequenceClassification` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForSequenceClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForSequenceClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `TapasConfig` configuration class: `TFTapasForSequenceClassification` (TAPAS model)
  - `TransfoXLConfig` configuration class: `TFTransfoXLForSequenceClassification` (Transformer-XL model)
  - `XLMConfig` configuration class: `TFXLMForSequenceClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForSequenceClassification` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSequenceClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForSequenceClassification` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model) - `CTRLConfig` configuration class: `TFCTRLForSequenceClassification` (CTRL model) - `CamembertConfig` configuration class: `TFCamembertForSequenceClassification` (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForSequenceClassification` (DistilBERT model) - 
[ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForSequenceClassification) (ELECTRA model) - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForSequenceClassification) (ESM model) - `FlaubertConfig` configuration class: `TFFlaubertForSequenceClassification` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForSequenceClassification` (Funnel Transformer model) - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [TFGPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2ForSequenceClassification) (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `TFGPTJForSequenceClassification` (GPT-J model) - `LayoutLMConfig` configuration class: `TFLayoutLMForSequenceClassification` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForSequenceClassification` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForSequenceClassification` (MPNet model) - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [TFMistralForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralForSequenceClassification) (Mistral model) - `MobileBertConfig` configuration class: `TFMobileBertForSequenceClassification` (MobileBERT model) - [OpenAIGPTConfig](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: 
[TFOpenAIGPTForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTForSequenceClassification) (OpenAI GPT model) - `RemBertConfig` configuration class: `TFRemBertForSequenceClassification` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForSequenceClassification` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForSequenceClassification) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model) - `TapasConfig` configuration class: `TFTapasForSequenceClassification` (TAPAS model) - `TransfoXLConfig` configuration class: `TFTransfoXLForSequenceClassification` (Transformer-XL model) - `XLMConfig` configuration class: `TFXLMForSequenceClassification` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForSequenceClassification` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForSequenceClassification` (ALBERT model)
- **bart** -- [TFBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.TFBartForSequenceClassification) (BART model)
- **bert** -- [TFBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForSequenceClassification) (BERT model)
- **camembert** -- `TFCamembertForSequenceClassification` (CamemBERT model)
- **convbert** -- [TFConvBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForSequenceClassification) (ConvBERT model)
- **ctrl** -- `TFCTRLForSequenceClassification` (CTRL model)
- **deberta** -- [TFDebertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForSequenceClassification) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForSequenceClassification` (DistilBERT model)
- **electra** -- [TFElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForSequenceClassification) (ELECTRA model)
- **esm** -- [TFEsmForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForSequenceClassification) (ESM model)
- **flaubert** -- `TFFlaubertForSequenceClassification` (FlauBERT model)
- **funnel** -- `TFFunnelForSequenceClassification` (Funnel Transformer model)
- **gpt-sw3** -- [TFGPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2ForSequenceClassification) (GPT-Sw3 model)
- **gpt2** -- [TFGPT2ForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.TFGPT2ForSequenceClassification) (OpenAI GPT-2 model)
- **gptj** -- `TFGPTJForSequenceClassification` (GPT-J model)
- **layoutlm** -- `TFLayoutLMForSequenceClassification` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForSequenceClassification` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForSequenceClassification` (Longformer model)
- **mistral** -- [TFMistralForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.TFMistralForSequenceClassification) (Mistral model)
- **mobilebert** -- `TFMobileBertForSequenceClassification` (MobileBERT model)
- **mpnet** -- `TFMPNetForSequenceClassification` (MPNet model)
- **openai-gpt** -- [TFOpenAIGPTForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/openai-gpt#transformers.TFOpenAIGPTForSequenceClassification) (OpenAI GPT model)
- **rembert** -- `TFRemBertForSequenceClassification` (RemBERT model)
- **roberta** -- [TFRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForSequenceClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForSequenceClassification` (RoFormer model)
- **tapas** -- `TFTapasForSequenceClassification` (TAPAS model)
- **transfo-xl** -- `TFTransfoXLForSequenceClassification` (Transformer-XL model)
- **xlm** -- `TFXLMForSequenceClassification` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForSequenceClassification` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
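The `kwargs` split described above can be sketched in plain Python. This is an illustrative sketch, not the actual library implementation: `DummyConfig` and `split_kwargs` are made-up names, and the real logic lives inside `from_pretrained()`.

```python
# Illustrative sketch (not the real transformers internals) of how extra
# keyword arguments are handled when no `config` is passed: keys naming a
# configuration attribute override the config, and the remaining keys are
# forwarded to the model's `__init__`.
class DummyConfig:
    """Stand-in for a PretrainedConfig with one known attribute."""

    def __init__(self):
        self.output_attentions = False


def split_kwargs(config, kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            model_kwargs[key] = value  # forwarded to the model __init__
    return config, model_kwargs


config, model_kwargs = split_kwargs(
    DummyConfig(), {"output_attentions": True, "custom_flag": 1}
)
```

Here `output_attentions` updates the configuration, while the unknown key `custom_flag` would be handed to the model's `__init__`.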

### FlaxAutoModelForSequenceClassification[[transformers.FlaxAutoModelForSequenceClassification]]

#### transformers.FlaxAutoModelForSequenceClassification[[transformers.FlaxAutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L320)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForSequenceClassification` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForSequenceClassification` (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForSequenceClassification` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForSequenceClassification) (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForSequenceClassification` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForSequenceClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForSequenceClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSequenceClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForSequenceClassification` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForSequenceClassification` (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForSequenceClassification` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForSequenceClassification) (ELECTRA model) - `MBartConfig` configuration class: `FlaxMBartForSequenceClassification` (mBART model) - `RoFormerConfig` configuration class: `FlaxRoFormerForSequenceClassification` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForSequenceClassification) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.FlaxAutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForSequenceClassification` (ALBERT model)
- **bart** -- [FlaxBartForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForSequenceClassification) (BART model)
- **bert** -- [FlaxBertForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForSequenceClassification) (BERT model)
- **big_bird** -- `FlaxBigBirdForSequenceClassification` (BigBird model)
- **distilbert** -- `FlaxDistilBertForSequenceClassification` (DistilBERT model)
- **electra** -- [FlaxElectraForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForSequenceClassification) (ELECTRA model)
- **mbart** -- `FlaxMBartForSequenceClassification` (mBART model)
- **roberta** -- [FlaxRobertaForSequenceClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForSequenceClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForSequenceClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForSequenceClassification` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForSequenceClassification` (XLM-RoBERTa model)
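The selection order above (the config's `model_type` first, then pattern matching on the name or path) can be sketched roughly as follows. The mapping entries and the `resolve_model_class` helper are illustrative only, not the real registry or implementation:

```python
# Hedged sketch of how an Auto class could pick a concrete class from
# `model_type`, falling back to pattern matching on the repo name/path.
MODEL_MAPPING = {
    "bert": "FlaxBertForSequenceClassification",
    "roberta": "FlaxRobertaForSequenceClassification",
    "xlm-roberta": "FlaxXLMRobertaForSequenceClassification",
}


def resolve_model_class(model_type, name_or_path):
    # 1) Exact match on the `model_type` property of the config.
    if model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # 2) Fallback: pattern-match the name/path, longest key first so that
    #    "xlm-roberta" is tried before "roberta".
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Unrecognized model for {name_or_path!r}")
```

For example, a checkpoint named `my-xlm-roberta-base` with no usable config would fall back to the `xlm-roberta` pattern rather than the shorter `roberta` one.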

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

#### transformers.AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2053)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForMultipleChoice) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForMultipleChoice` (BigBird model)
  - `CamembertConfig` configuration class: `CamembertForMultipleChoice` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineForMultipleChoice` (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForMultipleChoice` (Data2VecText model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `DistilBertForMultipleChoice` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForMultipleChoice) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForMultipleChoice` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForMultipleChoice` (ErnieM model)
  - `FNetConfig` configuration class: `FNetForMultipleChoice` (FNet model)
  - `FlaubertConfig` configuration class: `FlaubertForMultipleChoice` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForMultipleChoice` (Funnel Transformer model)
  - `IBertConfig` configuration class: `IBertForMultipleChoice` (I-BERT model)
  - `LongformerConfig` configuration class: `LongformerForMultipleChoice` (Longformer model)
  - `LukeConfig` configuration class: `LukeForMultipleChoice` (LUKE model)
  - `MPNetConfig` configuration class: `MPNetForMultipleChoice` (MPNet model)
  - `MegaConfig` configuration class: `MegaForMultipleChoice` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForMultipleChoice` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForMultipleChoice` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForMultipleChoice` (ModernBERT model)
  - `MraConfig` configuration class: `MraForMultipleChoice` (MRA model)
  - `NezhaConfig` configuration class: `NezhaForMultipleChoice` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForMultipleChoice` (Nyströmformer model)
  - `QDQBertConfig` configuration class: `QDQBertForMultipleChoice` (QDQBert model)
  - `RemBertConfig` configuration class: `RemBertForMultipleChoice` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForMultipleChoice` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForMultipleChoice` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMultipleChoice) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForMultipleChoice` (SqueezeBERT model)
  - `XLMConfig` configuration class: `XLMForMultipleChoice` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForMultipleChoice` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForMultipleChoice` (XLNet model)
  - `XmodConfig` configuration class: `XmodForMultipleChoice` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForMultipleChoice` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.
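As a rough illustration of this dispatch (with stand-in classes, not the real ones), `from_config` picks the concrete class from the configuration object's type and builds a randomly initialized model:

```python
# Minimal sketch, with made-up stand-in classes, of how `from_config()`
# could dispatch on the *type* of the configuration object. No weights
# are loaded here -- only the architecture is instantiated.
class BertConfig:
    pass


class RobertaConfig:
    pass


class BertForMultipleChoice:
    def __init__(self, config):
        self.config = config


class RobertaForMultipleChoice:
    def __init__(self, config):
        self.config = config


CONFIG_TO_MODEL = {
    BertConfig: BertForMultipleChoice,
    RobertaConfig: RobertaForMultipleChoice,
}


def from_config(config):
    model_cls = CONFIG_TO_MODEL[type(config)]  # selected by config class
    return model_cls(config)  # randomly initialized, no pretrained weights
```

This is also why registering a custom model requires both the config class (for lookup) and the model class (for instantiation).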

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForMultipleChoice) (BERT model) - `BigBirdConfig` configuration class: `BigBirdForMultipleChoice` (BigBird model) - `CamembertConfig` configuration class: `CamembertForMultipleChoice` (CamemBERT model) - `CanineConfig` configuration class: `CanineForMultipleChoice` (CANINE model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model) - `Data2VecTextConfig` configuration class: `Data2VecTextForMultipleChoice` (Data2VecText model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `DistilBertForMultipleChoice` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForMultipleChoice) (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForMultipleChoice` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMForMultipleChoice` (ErnieM model) - `FNetConfig` configuration class: `FNetForMultipleChoice` (FNet model) - `FlaubertConfig` configuration class: `FlaubertForMultipleChoice` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForMultipleChoice` (Funnel Transformer model) - `IBertConfig` configuration class: `IBertForMultipleChoice` (I-BERT model) - `LongformerConfig` configuration class: `LongformerForMultipleChoice` (Longformer model) - `LukeConfig` configuration class: `LukeForMultipleChoice` (LUKE model) - `MPNetConfig` configuration class: `MPNetForMultipleChoice` (MPNet model) - `MegaConfig` configuration class: `MegaForMultipleChoice` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForMultipleChoice` (Megatron-BERT model) - `MobileBertConfig` configuration class: `MobileBertForMultipleChoice` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForMultipleChoice` (ModernBERT model) - `MraConfig` configuration class: `MraForMultipleChoice` (MRA model) - `NezhaConfig` configuration class: `NezhaForMultipleChoice` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForMultipleChoice` (Nyströmformer model) - `QDQBertConfig` configuration class: `QDQBertForMultipleChoice` (QDQBert model) - `RemBertConfig` configuration class: `RemBertForMultipleChoice` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForMultipleChoice` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForMultipleChoice` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMultipleChoice) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `SqueezeBertConfig` configuration class: `SqueezeBertForMultipleChoice` (SqueezeBERT model) - `XLMConfig` configuration class: `XLMForMultipleChoice` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForMultipleChoice` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetForMultipleChoice` (XLNet model) - `XmodConfig` configuration class: `XmodForMultipleChoice` (X-MOD model) - `YosoConfig` configuration class: `YosoForMultipleChoice` (YOSO model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertForMultipleChoice) (ALBERT model)
- **bert** -- [BertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForMultipleChoice) (BERT model)
- **big_bird** -- `BigBirdForMultipleChoice` (BigBird model)
- **camembert** -- `CamembertForMultipleChoice` (CamemBERT model)
- **canine** -- `CanineForMultipleChoice` (CANINE model)
- **convbert** -- [ConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBERT model)
- **data2vec-text** -- `Data2VecTextForMultipleChoice` (Data2VecText model)
- **deberta-v2** -- [DebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DeBERTa-v2 model)
- **distilbert** -- `DistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- [ElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForMultipleChoice) (ELECTRA model)
- **ernie** -- `ErnieForMultipleChoice` (ERNIE model)
- **ernie_m** -- `ErnieMForMultipleChoice` (ErnieM model)
- **flaubert** -- `FlaubertForMultipleChoice` (FlauBERT model)
- **fnet** -- `FNetForMultipleChoice` (FNet model)
- **funnel** -- `FunnelForMultipleChoice` (Funnel Transformer model)
- **ibert** -- `IBertForMultipleChoice` (I-BERT model)
- **longformer** -- `LongformerForMultipleChoice` (Longformer model)
- **luke** -- `LukeForMultipleChoice` (LUKE model)
- **mega** -- `MegaForMultipleChoice` (MEGA model)
- **megatron-bert** -- `MegatronBertForMultipleChoice` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForMultipleChoice` (MobileBERT model)
- **modernbert** -- `ModernBertForMultipleChoice` (ModernBERT model)
- **mpnet** -- `MPNetForMultipleChoice` (MPNet model)
- **mra** -- `MraForMultipleChoice` (MRA model)
- **nezha** -- `NezhaForMultipleChoice` (Nezha model)
- **nystromformer** -- `NystromformerForMultipleChoice` (Nyströmformer model)
- **qdqbert** -- `QDQBertForMultipleChoice` (QDQBert model)
- **rembert** -- `RemBertForMultipleChoice` (RemBERT model)
- **roberta** -- [RobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForMultipleChoice) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForMultipleChoice` (RoCBert model)
- **roformer** -- `RoFormerForMultipleChoice` (RoFormer model)
- **squeezebert** -- `SqueezeBertForMultipleChoice` (SqueezeBERT model)
- **xlm** -- `XLMForMultipleChoice` (XLM model)
- **xlm-roberta** -- `XLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForMultipleChoice` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForMultipleChoice` (XLNet model)
- **xmod** -- `XmodForMultipleChoice` (X-MOD model)
- **yoso** -- `YosoForMultipleChoice` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
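For multiple choice, the model expects inputs of shape `(batch_size, num_choices, sequence_length)` and returns one logit per choice. The sketch below uses a small randomly initialized model (built from a local config, so nothing is downloaded) purely to illustrate the shapes; with a real checkpoint you would tokenize each question/choice pair instead of using random ids:

```python
import torch
from transformers import AutoConfig, AutoModelForMultipleChoice

# Tiny local config: no download, random weights, shapes only.
config = AutoConfig.for_model(
    "bert", hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)
model = AutoModelForMultipleChoice.from_config(config)
model.eval()  # from_config does not call eval() for you

# One example with 4 candidate answers, 16 tokens each.
input_ids = torch.randint(0, config.vocab_size, (1, 4, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

print(outputs.logits.shape)  # torch.Size([1, 4]) -- one score per choice
```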

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
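The kwargs behavior described above can be checked offline with a save/reload round trip: when no explicit `config` is passed to `from_pretrained()`, keyword arguments that match configuration attributes override them. A sketch (tiny config sizes are arbitrary; the model is randomly initialized):

```python
import tempfile
from transformers import AutoConfig, AutoModelForMultipleChoice

config = AutoConfig.for_model(
    "bert", hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)
model = AutoModelForMultipleChoice.from_config(config)

with tempfile.TemporaryDirectory() as tmp:
    # Writes config.json plus the weight file into tmp.
    model.save_pretrained(tmp)
    # No explicit `config` here, so output_attentions=True is routed to the
    # configuration loader and overrides the saved attribute.
    reloaded = AutoModelForMultipleChoice.from_pretrained(tmp, output_attentions=True)

print(reloaded.config.output_attentions)  # True
```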

### TFAutoModelForMultipleChoice[[transformers.TFAutoModelForMultipleChoice]]

#### transformers.TFAutoModelForMultipleChoice[[transformers.TFAutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L684)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForMultipleChoice` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model)
  - `CamembertConfig` configuration class: `TFCamembertForMultipleChoice` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForMultipleChoice` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForMultipleChoice) (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertForMultipleChoice` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForMultipleChoice` (Funnel Transformer model)
  - `LongformerConfig` configuration class: `TFLongformerForMultipleChoice` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForMultipleChoice` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForMultipleChoice` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForMultipleChoice` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForMultipleChoice` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMultipleChoice) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForMultipleChoice` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForMultipleChoice` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForMultipleChoice` (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model) - `CamembertConfig` configuration class: `TFCamembertForMultipleChoice` (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForMultipleChoice` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForMultipleChoice) (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertForMultipleChoice` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForMultipleChoice` (Funnel Transformer model) - `LongformerConfig` configuration class: `TFLongformerForMultipleChoice` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForMultipleChoice` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForMultipleChoice` (MobileBERT model) - `RemBertConfig` 
configuration class: `TFRemBertForMultipleChoice` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForMultipleChoice` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMultipleChoice) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForMultipleChoice` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForMultipleChoice` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForMultipleChoice` (ALBERT model)
- **bert** -- [TFBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForMultipleChoice) (BERT model)
- **camembert** -- `TFCamembertForMultipleChoice` (CamemBERT model)
- **convbert** -- [TFConvBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForMultipleChoice) (ConvBERT model)
- **deberta-v2** -- [TFDebertaV2ForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForMultipleChoice) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- [TFElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForMultipleChoice) (ELECTRA model)
- **flaubert** -- `TFFlaubertForMultipleChoice` (FlauBERT model)
- **funnel** -- `TFFunnelForMultipleChoice` (Funnel Transformer model)
- **longformer** -- `TFLongformerForMultipleChoice` (Longformer model)
- **mobilebert** -- `TFMobileBertForMultipleChoice` (MobileBERT model)
- **mpnet** -- `TFMPNetForMultipleChoice` (MPNet model)
- **rembert** -- `TFRemBertForMultipleChoice` (RemBERT model)
- **roberta** -- [TFRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForMultipleChoice) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForMultipleChoice` (RoFormer model)
- **xlm** -- `TFXLMForMultipleChoice` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForMultipleChoice` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForMultipleChoice[[transformers.FlaxAutoModelForMultipleChoice]]

#### transformers.FlaxAutoModelForMultipleChoice[[transformers.FlaxAutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L345)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForMultipleChoice` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForMultipleChoice` (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForMultipleChoice` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMultipleChoice) (ELECTRA model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForMultipleChoice` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMultipleChoice) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForMultipleChoice` (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForMultipleChoice` (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForMultipleChoice` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMultipleChoice) (ELECTRA model) - `RoFormerConfig` configuration class: `FlaxRoFormerForMultipleChoice` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMultipleChoice) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForMultipleChoice` (ALBERT model)
- **bert** -- [FlaxBertForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForMultipleChoice) (BERT model)
- **big_bird** -- `FlaxBigBirdForMultipleChoice` (BigBird model)
- **distilbert** -- `FlaxDistilBertForMultipleChoice` (DistilBERT model)
- **electra** -- [FlaxElectraForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForMultipleChoice) (ELECTRA model)
- **roberta** -- [FlaxRobertaForMultipleChoice](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForMultipleChoice) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForMultipleChoice` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForMultipleChoice` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForMultipleChoice` (XLM-RoBERTa model)
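The two-step selection above (explicit `model_type` first, then pattern matching on the name or path) can be sketched roughly as follows. This is a simplified illustration with a hypothetical `select_model_class` helper and a trimmed mapping, not the library's actual implementation (which lives in `auto_factory.py`):

```python
# Simplified sketch of the selection logic described above (hypothetical
# helper; the real logic lives in transformers' auto_factory.py).
FLAX_MULTIPLE_CHOICE_MAPPING = {
    "albert": "FlaxAlbertForMultipleChoice",
    "bert": "FlaxBertForMultipleChoice",
    "roberta": "FlaxRobertaForMultipleChoice",
    "xlm-roberta": "FlaxXLMRobertaForMultipleChoice",
}

def select_model_class(pretrained_model_name_or_path, model_type=None):
    # 1) Prefer the config's explicit `model_type` when it is known.
    if model_type in FLAX_MULTIPLE_CHOICE_MAPPING:
        return FLAX_MULTIPLE_CHOICE_MAPPING[model_type]
    # 2) Otherwise fall back to pattern matching on the name/path.
    #    Longer keys are checked first so "xlm-roberta" wins over "roberta".
    for key in sorted(FLAX_MULTIPLE_CHOICE_MAPPING, key=len, reverse=True):
        if key in pretrained_model_name_or_path:
            return FLAX_MULTIPLE_CHOICE_MAPPING[key]
    raise ValueError(f"Could not infer a model type from {pretrained_model_name_or_path!r}")

print(select_model_class("google-bert/bert-base-cased"))  # FlaxBertForMultipleChoice
```

The longest-key-first ordering matters: a naive substring check would match `"roberta"` inside `"xlm-roberta-base"` and pick the wrong class.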

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
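The kwargs routing described above — keys that match configuration attributes update the config, while the remaining keys reach the model's `__init__` — can be sketched with a toy config. The `ToyConfig` and `split_kwargs` names here are hypothetical, purely for illustration:

```python
# Toy sketch of how from_pretrained() splits **kwargs when no explicit
# `config` is passed (hypothetical helper; not the library implementation).
class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # Key matches a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown key: forwarded to the model's __init__.
            model_kwargs[key] = value
    return config, model_kwargs

config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, some_model_arg=1)
```

After this call, `config.output_attentions` is `True` while `some_model_arg` is left for the model constructor.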

### AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

#### transformers.AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2060)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model)
  - `ErnieConfig` configuration class: `ErnieForNextSentencePrediction` (ERNIE model)
  - `FNetConfig` configuration class: `FNetForNextSentencePrediction` (FNet model)
  - `MegatronBertConfig` configuration class: `MegatronBertForNextSentencePrediction` (Megatron-BERT model)
  - `MobileBertConfig` configuration class: `MobileBertForNextSentencePrediction` (MobileBERT model)
  - `NezhaConfig` configuration class: `NezhaForNextSentencePrediction` (Nezha model)
  - `QDQBertConfig` configuration class: `QDQBertForNextSentencePrediction` (QDQBert model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model) - `ErnieConfig` configuration class: `ErnieForNextSentencePrediction` (ERNIE model) - `FNetConfig` configuration class: `FNetForNextSentencePrediction` (FNet model) - `MegatronBertConfig` configuration class: `MegatronBertForNextSentencePrediction` (Megatron-BERT model) - `MobileBertConfig` configuration class: `MobileBertForNextSentencePrediction` (MobileBERT model) - `NezhaConfig` configuration class: `NezhaForNextSentencePrediction` (Nezha model) - `QDQBertConfig` configuration class: `QDQBertForNextSentencePrediction` (QDQBert model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [BertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForNextSentencePrediction) (BERT model)
- **ernie** -- `ErnieForNextSentencePrediction` (ERNIE model)
- **fnet** -- `FNetForNextSentencePrediction` (FNet model)
- **megatron-bert** -- `MegatronBertForNextSentencePrediction` (Megatron-BERT model)
- **mobilebert** -- `MobileBertForNextSentencePrediction` (MobileBERT model)
- **nezha** -- `NezhaForNextSentencePrediction` (Nezha model)
- **qdqbert** -- `QDQBertForNextSentencePrediction` (QDQBert model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
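The evaluation/training toggle mentioned above can be illustrated with a toy stand-in. `ToyModel` is a hypothetical class mimicking the `train()`/`eval()` behavior of `torch.nn.Module`, used here to avoid downloading real weights:

```python
# A minimal stand-in for the train/eval switch described above (hypothetical
# toy class; real models inherit this from torch.nn.Module).
class ToyModel:
    def __init__(self):
        self.training = False  # from_pretrained() leaves the model in eval mode

    def eval(self):
        self.training = False  # e.g. dropout modules are deactivated
        return self

    def train(self):
        self.training = True   # re-enable dropout etc. before fine-tuning
        return self

model = ToyModel()
assert not model.training  # evaluation mode by default
model.train()
assert model.training      # switch back before training
```

With a real model the calls are identical: `model.train()` before fine-tuning, `model.eval()` before inference.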

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForNextSentencePrediction[[transformers.TFAutoModelForNextSentencePrediction]]

#### transformers.TFAutoModelForNextSentencePrediction[[transformers.TFAutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L691)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model)
  - `MobileBertConfig` configuration class: `TFMobileBertForNextSentencePrediction` (MobileBERT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model) - `MobileBertConfig` configuration class: `TFMobileBertForNextSentencePrediction` (MobileBERT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [TFBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForNextSentencePrediction) (BERT model)
- **mobilebert** -- `TFMobileBertForNextSentencePrediction` (MobileBERT model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForNextSentencePrediction[[transformers.FlaxAutoModelForNextSentencePrediction]]

#### transformers.FlaxAutoModelForNextSentencePrediction[[transformers.FlaxAutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L352)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [FlaxBertForNextSentencePrediction](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForNextSentencePrediction) (BERT model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

#### transformers.AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2046)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
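As a quick illustration of this constraint, direct construction fails immediately (a minimal sketch; the exact error message may vary between versions):

```python
from transformers import AutoModelForTokenClassification

# Direct construction is intentionally blocked; only the from_pretrained()
# and from_config() class methods are supported entry points.
try:
    AutoModelForTokenClassification()
except EnvironmentError as exc:
    print("raised:", type(exc).__name__)
```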

#### from_config[[transformers.AutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForTokenClassification` (ALBERT model)
  - `ApertusConfig` configuration class: `ApertusForTokenClassification` (Apertus model)
  - `ArceeConfig` configuration class: `ArceeForTokenClassification` (Arcee model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForTokenClassification) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForTokenClassification` (BigBird model)
  - [BioGptConfig](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGpt model)
  - `BloomConfig` configuration class: `BloomForTokenClassification` (BLOOM model)
  - `BrosConfig` configuration class: `BrosForTokenClassification` (BROS model)
  - `CamembertConfig` configuration class: `CamembertForTokenClassification` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineForTokenClassification` (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBERT model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForTokenClassification` (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForTokenClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DeBERTa-v2 model)
  - [DeepseekV3Config](/docs/transformers/v4.57.1/ko/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: `DeepseekV3ForTokenClassification` (DeepSeek-V3 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForTokenClassification` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForTokenClassification` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForTokenClassification) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForTokenClassification` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForTokenClassification` (ErnieM model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForTokenClassification) (ESM model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForTokenClassification) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForTokenClassification` (FNet model)
  - `FalconConfig` configuration class: `FalconForTokenClassification` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForTokenClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForTokenClassification` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForTokenClassification) (OpenAI GPT-2 model)
  - `GPTBigCodeConfig` configuration class: `GPTBigCodeForTokenClassification` (GPTBigCode model)
  - `GPTNeoConfig` configuration class: `GPTNeoForTokenClassification` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForTokenClassification` (GPT NeoX model)
  - [Gemma2Config](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForTokenClassification) (Gemma2 model)
  - [GemmaConfig](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForTokenClassification) (Gemma model)
  - `Glm4Config` configuration class: `Glm4ForTokenClassification` (GLM4 model)
  - `GlmConfig` configuration class: `GlmForTokenClassification` (GLM model)
  - `GptOssConfig` configuration class: `GptOssForTokenClassification` (GptOss model)
  - `HeliumConfig` configuration class: `HeliumForTokenClassification` (Helium model)
  - `IBertConfig` configuration class: `IBertForTokenClassification` (I-BERT model)
  - `LayoutLMConfig` configuration class: `LayoutLMForTokenClassification` (LayoutLM model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForTokenClassification` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForTokenClassification` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForTokenClassification` (LiLT model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `LlamaForTokenClassification` (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForTokenClassification` (Longformer model)
  - `LukeConfig` configuration class: `LukeForTokenClassification` (LUKE model)
  - `MPNetConfig` configuration class: `MPNetForTokenClassification` (MPNet model)
  - `MT5Config` configuration class: `MT5ForTokenClassification` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForTokenClassification` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForTokenClassification` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForTokenClassification` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForTokenClassification` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForTokenClassification` (Ministral model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForTokenClassification) (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForTokenClassification` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForTokenClassification` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForTokenClassification` (ModernBERT model)
  - `MptConfig` configuration class: `MptForTokenClassification` (MPT model)
  - `MraConfig` configuration class: `MraForTokenClassification` (MRA model)
  - `NemotronConfig` configuration class: `NemotronForTokenClassification` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForTokenClassification` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForTokenClassification` (Nyströmformer model)
  - `PersimmonConfig` configuration class: `PersimmonForTokenClassification` (Persimmon model)
  - `Phi3Config` configuration class: `Phi3ForTokenClassification` (Phi3 model)
  - `PhiConfig` configuration class: `PhiForTokenClassification` (Phi model)
  - `QDQBertConfig` configuration class: `QDQBertForTokenClassification` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForTokenClassification` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForTokenClassification` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForTokenClassification` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForTokenClassification` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForTokenClassification` (Qwen3Next model)
  - `RemBertConfig` configuration class: `RemBertForTokenClassification` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForTokenClassification` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForTokenClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForTokenClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForTokenClassification` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForTokenClassification` (SmolLM3 model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForTokenClassification` (SqueezeBERT model)
  - `StableLmConfig` configuration class: `StableLmForTokenClassification` (StableLm model)
  - `Starcoder2Config` configuration class: `Starcoder2ForTokenClassification` (Starcoder2 model)
  - `T5Config` configuration class: `T5ForTokenClassification` (T5 model)
  - `T5GemmaConfig` configuration class: `T5GemmaForTokenClassification` (T5Gemma model)
  - `UMT5Config` configuration class: `UMT5ForTokenClassification` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForTokenClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForTokenClassification` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForTokenClassification` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForTokenClassification` (XLNet model)
  - `XmodConfig` configuration class: `XmodForTokenClassification` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForTokenClassification` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)
```
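Because only the configuration is used, the dispatched class and its label count come entirely from the config object. A minimal offline sketch (using a deliberately tiny, illustrative `BertConfig` rather than a downloaded one — the sizes below are not those of any released checkpoint):

```python
from transformers import BertConfig, AutoModelForTokenClassification

# A deliberately small config so instantiation is fast; no network access needed.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=5,
)

# The BertConfig class selects BertForTokenClassification, per the mapping above.
# Weights are randomly initialized -- nothing is downloaded or loaded.
model = AutoModelForTokenClassification.from_config(config)
print(type(model).__name__)  # BertForTokenClassification
print(model.num_labels)      # 5
```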

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the full configuration-to-model mapping listed above under `from_config`.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForTokenClassification` (ALBERT model)
- **apertus** -- `ApertusForTokenClassification` (Apertus model)
- **arcee** -- `ArceeForTokenClassification` (Arcee model)
- **bert** -- [BertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForTokenClassification) (BERT model)
- **big_bird** -- `BigBirdForTokenClassification` (BigBird model)
- **biogpt** -- [BioGptForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGpt model)
- **bloom** -- `BloomForTokenClassification` (BLOOM model)
- **bros** -- `BrosForTokenClassification` (BROS model)
- **camembert** -- `CamembertForTokenClassification` (CamemBERT model)
- **canine** -- `CanineForTokenClassification` (CANINE model)
- **convbert** -- [ConvBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBERT model)
- **data2vec-text** -- `Data2VecTextForTokenClassification` (Data2VecText model)
- **deberta** -- [DebertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForTokenClassification) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DeBERTa-v2 model)
- **deepseek_v3** -- `DeepseekV3ForTokenClassification` (DeepSeek-V3 model)
- **diffllama** -- `DiffLlamaForTokenClassification` (DiffLlama model)
- **distilbert** -- `DistilBertForTokenClassification` (DistilBERT model)
- **electra** -- [ElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForTokenClassification) (ELECTRA model)
- **ernie** -- `ErnieForTokenClassification` (ERNIE model)
- **ernie_m** -- `ErnieMForTokenClassification` (ErnieM model)
- **esm** -- [EsmForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmForTokenClassification) (ESM model)
- **exaone4** -- [Exaone4ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForTokenClassification) (EXAONE-4.0 model)
- **falcon** -- `FalconForTokenClassification` (Falcon model)
- **flaubert** -- `FlaubertForTokenClassification` (FlauBERT model)
- **fnet** -- `FNetForTokenClassification` (FNet model)
- **funnel** -- `FunnelForTokenClassification` (Funnel Transformer model)
- **gemma** -- [GemmaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma#transformers.GemmaForTokenClassification) (Gemma model)
- **gemma2** -- [Gemma2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gemma2#transformers.Gemma2ForTokenClassification) (Gemma2 model)
- **glm** -- `GlmForTokenClassification` (GLM model)
- **glm4** -- `Glm4ForTokenClassification` (GLM4 model)
- **gpt-sw3** -- [GPT2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForTokenClassification) (GPT-Sw3 model)
- **gpt2** -- [GPT2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForTokenClassification) (OpenAI GPT-2 model)
- **gpt_bigcode** -- `GPTBigCodeForTokenClassification` (GPTBigCode model)
- **gpt_neo** -- `GPTNeoForTokenClassification` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForTokenClassification` (GPT NeoX model)
- **gpt_oss** -- `GptOssForTokenClassification` (GptOss model)
- **helium** -- `HeliumForTokenClassification` (Helium model)
- **ibert** -- `IBertForTokenClassification` (I-BERT model)
- **layoutlm** -- `LayoutLMForTokenClassification` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForTokenClassification` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForTokenClassification` (LayoutLMv3 model)
- **lilt** -- `LiltForTokenClassification` (LiLT model)
- **llama** -- `LlamaForTokenClassification` (LLaMA model)
- **longformer** -- `LongformerForTokenClassification` (Longformer model)
- **luke** -- `LukeForTokenClassification` (LUKE model)
- **markuplm** -- `MarkupLMForTokenClassification` (MarkupLM model)
- **mega** -- `MegaForTokenClassification` (MEGA model)
- **megatron-bert** -- `MegatronBertForTokenClassification` (Megatron-BERT model)
- **minimax** -- `MiniMaxForTokenClassification` (MiniMax model)
- **ministral** -- `MinistralForTokenClassification` (Ministral model)
- **mistral** -- [MistralForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralForTokenClassification) (Mistral model)
- **mixtral** -- `MixtralForTokenClassification` (Mixtral model)
- **mobilebert** -- `MobileBertForTokenClassification` (MobileBERT model)
- **modernbert** -- `ModernBertForTokenClassification` (ModernBERT model)
- **mpnet** -- `MPNetForTokenClassification` (MPNet model)
- **mpt** -- `MptForTokenClassification` (MPT model)
- **mra** -- `MraForTokenClassification` (MRA model)
- **mt5** -- `MT5ForTokenClassification` (MT5 model)
- **nemotron** -- `NemotronForTokenClassification` (Nemotron model)
- **nezha** -- `NezhaForTokenClassification` (Nezha model)
- **nystromformer** -- `NystromformerForTokenClassification` (Nyströmformer model)
- **persimmon** -- `PersimmonForTokenClassification` (Persimmon model)
- **phi** -- `PhiForTokenClassification` (Phi model)
- **phi3** -- `Phi3ForTokenClassification` (Phi3 model)
- **qdqbert** -- `QDQBertForTokenClassification` (QDQBert model)
- **qwen2** -- `Qwen2ForTokenClassification` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForTokenClassification` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForTokenClassification` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForTokenClassification` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForTokenClassification` (Qwen3Next model)
- **rembert** -- `RemBertForTokenClassification` (RemBERT model)
- **roberta** -- [RobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForTokenClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForTokenClassification` (RoCBert model)
- **roformer** -- `RoFormerForTokenClassification` (RoFormer model)
- **seed_oss** -- `SeedOssForTokenClassification` (SeedOss model)
- **smollm3** -- `SmolLM3ForTokenClassification` (SmolLM3 model)
- **squeezebert** -- `SqueezeBertForTokenClassification` (SqueezeBERT model)
- **stablelm** -- `StableLmForTokenClassification` (StableLm model)
- **starcoder2** -- `Starcoder2ForTokenClassification` (Starcoder2 model)
- **t5** -- `T5ForTokenClassification` (T5 model)
- **t5gemma** -- `T5GemmaForTokenClassification` (T5Gemma model)
- **umt5** -- `UMT5ForTokenClassification` (UMT5 model)
- **xlm** -- `XLMForTokenClassification` (XLM model)
- **xlm-roberta** -- `XLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForTokenClassification` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForTokenClassification` (XLNet model)
- **xmod** -- `XmodForTokenClassification` (X-MOD model)
- **yoso** -- `YosoForTokenClassification` (YOSO model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.
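A minimal sketch of the `state_dict` option, using a tiny randomly initialized model saved to a temporary directory so nothing is downloaded (the config sizes are arbitrary; assumes `transformers` and `torch` are installed):

```python
import tempfile

import torch
from transformers import AutoModelForTokenClassification, BertConfig

# Tiny randomly initialized model, saved locally so no download is needed.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                    intermediate_size=64, num_labels=3)
model = AutoModelForTokenClassification.from_config(config)

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)

    # Override one tensor before loading: the checkpoint directory supplies the
    # configuration, but the weights come from our modified state dict.
    state_dict = model.state_dict()
    state_dict["classifier.bias"] = torch.ones_like(state_dict["classifier.bias"])
    reloaded = AutoModelForTokenClassification.from_pretrained(tmp, state_dict=state_dict)
```

After loading, `reloaded.classifier.bias` holds the overridden all-ones tensor rather than the saved weights.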

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
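The no-`config` branch of this behavior can be sketched offline by first saving a tiny randomly initialized model locally (a sketch; the config sizes are arbitrary and assume `transformers` and `torch` are installed):

```python
import tempfile

from transformers import AutoModelForTokenClassification, BertConfig

# Tiny local checkpoint so no download is needed.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                    intermediate_size=64)
with tempfile.TemporaryDirectory() as tmp:
    AutoModelForTokenClassification.from_config(config).save_pretrained(tmp)

    # No explicit `config` argument, so `output_attentions=True` first updates
    # the configuration loaded from the checkpoint directory.
    model = AutoModelForTokenClassification.from_pretrained(tmp, output_attentions=True)
    print(model.config.output_attentions)  # True
```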

### TFAutoModelForTokenClassification[[transformers.TFAutoModelForTokenClassification]]

#### transformers.TFAutoModelForTokenClassification[[transformers.TFAutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L675)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForTokenClassification` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model)
  - `CamembertConfig` configuration class: `TFCamembertForTokenClassification` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForTokenClassification` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForTokenClassification) (ELECTRA model)
  - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: [TFEsmForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForTokenClassification) (ESM model)
  - `FlaubertConfig` configuration class: `TFFlaubertForTokenClassification` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForTokenClassification` (Funnel Transformer model)
  - `LayoutLMConfig` configuration class: `TFLayoutLMForTokenClassification` (LayoutLM model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForTokenClassification` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForTokenClassification` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForTokenClassification` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForTokenClassification` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForTokenClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForTokenClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForTokenClassification` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForTokenClassification` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForTokenClassification` (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model) - `CamembertConfig` configuration class: `TFCamembertForTokenClassification` (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForTokenClassification` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForTokenClassification) (ELECTRA model) - [EsmConfig](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.EsmConfig) configuration class: 
[TFEsmForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForTokenClassification) (ESM model) - `FlaubertConfig` configuration class: `TFFlaubertForTokenClassification` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForTokenClassification` (Funnel Transformer model) - `LayoutLMConfig` configuration class: `TFLayoutLMForTokenClassification` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForTokenClassification` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForTokenClassification` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForTokenClassification` (MobileBERT model) - `RemBertConfig` configuration class: `TFRemBertForTokenClassification` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForTokenClassification` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForTokenClassification) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForTokenClassification` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForTokenClassification` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForTokenClassification` (ALBERT model)
- **bert** -- [TFBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForTokenClassification) (BERT model)
- **camembert** -- `TFCamembertForTokenClassification` (CamemBERT model)
- **convbert** -- [TFConvBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForTokenClassification) (ConvBERT model)
- **deberta** -- [TFDebertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForTokenClassification) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForTokenClassification) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForTokenClassification` (DistilBERT model)
- **electra** -- [TFElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForTokenClassification) (ELECTRA model)
- **esm** -- [TFEsmForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/esm#transformers.TFEsmForTokenClassification) (ESM model)
- **flaubert** -- `TFFlaubertForTokenClassification` (FlauBERT model)
- **funnel** -- `TFFunnelForTokenClassification` (Funnel Transformer model)
- **layoutlm** -- `TFLayoutLMForTokenClassification` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForTokenClassification` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForTokenClassification` (Longformer model)
- **mobilebert** -- `TFMobileBertForTokenClassification` (MobileBERT model)
- **mpnet** -- `TFMPNetForTokenClassification` (MPNet model)
- **rembert** -- `TFRemBertForTokenClassification` (RemBERT model)
- **roberta** -- [TFRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForTokenClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForTokenClassification` (RoFormer model)
- **xlm** -- `TFXLMForTokenClassification` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForTokenClassification` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForTokenClassification[[transformers.FlaxAutoModelForTokenClassification]]

#### transformers.FlaxAutoModelForTokenClassification[[transformers.FlaxAutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L336)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForTokenClassification` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForTokenClassification` (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForTokenClassification` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForTokenClassification) (ELECTRA model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForTokenClassification` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForTokenClassification) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForTokenClassification` (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForTokenClassification` (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForTokenClassification` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForTokenClassification) (ELECTRA model) - `RoFormerConfig` configuration class: `FlaxRoFormerForTokenClassification` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForTokenClassification) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
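The default-selection rule for `attn_implementation` described above can be sketched as a small helper (a hedged simplification, not the library's actual dispatch code):

```python
# Sketch of the choice described above: an explicit request wins; otherwise
# SDPA is used for torch >= 2.1.1 when available, else the manual "eager" path.

def pick_attn_implementation(requested, torch_version, sdpa_available):
    if requested is not None:
        return requested  # "eager", "sdpa", or "flash_attention_2"
    version_tuple = tuple(int(part) for part in torch_version.split(".")[:3])
    if sdpa_available and version_tuple >= (2, 1, 1):
        return "sdpa"
    return "eager"
```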
#### from_pretrained[[transformers.FlaxAutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForTokenClassification` (ALBERT model)
- **bert** -- [FlaxBertForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForTokenClassification) (BERT model)
- **big_bird** -- `FlaxBigBirdForTokenClassification` (BigBird model)
- **distilbert** -- `FlaxDistilBertForTokenClassification` (DistilBERT model)
- **electra** -- [FlaxElectraForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForTokenClassification) (ELECTRA model)
- **roberta** -- [FlaxRobertaForTokenClassification](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForTokenClassification) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForTokenClassification` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForTokenClassification` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForTokenClassification` (XLM-RoBERTa model)
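The two-step selection above — `model_type` lookup first, then pattern matching on the name or path — can be sketched with a toy mapping (the real mapping lives inside `transformers.models.auto`; the lookup logic here is a simplification):

```python
# Toy sketch of the selection fallback. The class names are real; the
# resolution logic is illustrative only.

TOKEN_CLS_MAPPING = {
    "bert": "FlaxBertForTokenClassification",
    "roberta": "FlaxRobertaForTokenClassification",
    "xlm-roberta": "FlaxXLMRobertaForTokenClassification",
}


def resolve_model_class(model_type, pretrained_model_name_or_path):
    # 1) Prefer the config's model_type when it is present in the mapping.
    if model_type in TOKEN_CLS_MAPPING:
        return TOKEN_CLS_MAPPING[model_type]
    # 2) Fall back to pattern matching on the name/path; try longer keys
    #    first so "xlm-roberta" beats its substrings "roberta" and "bert".
    for key in sorted(TOKEN_CLS_MAPPING, key=len, reverse=True):
        if key in pretrained_model_name_or_path:
            return TOKEN_CLS_MAPPING[key]
    raise ValueError(f"Unrecognized model in {pretrained_model_name_or_path!r}")
```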

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

#### transformers.AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2006)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForQuestionAnswering` (ALBERT model)
  - `ArceeConfig` configuration class: `ArceeForQuestionAnswering` (Arcee model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForQuestionAnswering) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model)
  - `BigBirdConfig` configuration class: `BigBirdForQuestionAnswering` (BigBird model)
  - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForQuestionAnswering` (BigBird-Pegasus model)
  - `BloomConfig` configuration class: `BloomForQuestionAnswering` (BLOOM model)
  - `CamembertConfig` configuration class: `CamembertForQuestionAnswering` (CamemBERT model)
  - `CanineConfig` configuration class: `CanineForQuestionAnswering` (CANINE model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model)
  - `Data2VecTextConfig` configuration class: `Data2VecTextForQuestionAnswering` (Data2VecText model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
  - `DiffLlamaConfig` configuration class: `DiffLlamaForQuestionAnswering` (DiffLlama model)
  - `DistilBertConfig` configuration class: `DistilBertForQuestionAnswering` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForQuestionAnswering) (ELECTRA model)
  - `ErnieConfig` configuration class: `ErnieForQuestionAnswering` (ERNIE model)
  - `ErnieMConfig` configuration class: `ErnieMForQuestionAnswering` (ErnieM model)
  - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForQuestionAnswering) (EXAONE-4.0 model)
  - `FNetConfig` configuration class: `FNetForQuestionAnswering` (FNet model)
  - `FalconConfig` configuration class: `FalconForQuestionAnswering` (Falcon model)
  - `FlaubertConfig` configuration class: `FlaubertForQuestionAnsweringSimple` (FlauBERT model)
  - `FunnelConfig` configuration class: `FunnelForQuestionAnswering` (Funnel Transformer model)
  - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForQuestionAnswering) (OpenAI GPT-2 model)
  - `GPTJConfig` configuration class: `GPTJForQuestionAnswering` (GPT-J model)
  - `GPTNeoConfig` configuration class: `GPTNeoForQuestionAnswering` (GPT Neo model)
  - `GPTNeoXConfig` configuration class: `GPTNeoXForQuestionAnswering` (GPT NeoX model)
  - `IBertConfig` configuration class: `IBertForQuestionAnswering` (I-BERT model)
  - `LEDConfig` configuration class: `LEDForQuestionAnswering` (LED model)
  - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
  - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
  - `LiltConfig` configuration class: `LiltForQuestionAnswering` (LiLT model)
  - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `LlamaForQuestionAnswering` (LLaMA model)
  - `LongformerConfig` configuration class: `LongformerForQuestionAnswering` (Longformer model)
  - `LukeConfig` configuration class: `LukeForQuestionAnswering` (LUKE model)
  - `LxmertConfig` configuration class: `LxmertForQuestionAnswering` (LXMERT model)
  - `MBartConfig` configuration class: `MBartForQuestionAnswering` (mBART model)
  - `MPNetConfig` configuration class: `MPNetForQuestionAnswering` (MPNet model)
  - `MT5Config` configuration class: `MT5ForQuestionAnswering` (MT5 model)
  - `MarkupLMConfig` configuration class: `MarkupLMForQuestionAnswering` (MarkupLM model)
  - `MegaConfig` configuration class: `MegaForQuestionAnswering` (MEGA model)
  - `MegatronBertConfig` configuration class: `MegatronBertForQuestionAnswering` (Megatron-BERT model)
  - `MiniMaxConfig` configuration class: `MiniMaxForQuestionAnswering` (MiniMax model)
  - `MinistralConfig` configuration class: `MinistralForQuestionAnswering` (Ministral model)
  - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: `MistralForQuestionAnswering` (Mistral model)
  - `MixtralConfig` configuration class: `MixtralForQuestionAnswering` (Mixtral model)
  - `MobileBertConfig` configuration class: `MobileBertForQuestionAnswering` (MobileBERT model)
  - `ModernBertConfig` configuration class: `ModernBertForQuestionAnswering` (ModernBERT model)
  - `MptConfig` configuration class: `MptForQuestionAnswering` (MPT model)
  - `MraConfig` configuration class: `MraForQuestionAnswering` (MRA model)
  - `MvpConfig` configuration class: `MvpForQuestionAnswering` (MVP model)
  - `NemotronConfig` configuration class: `NemotronForQuestionAnswering` (Nemotron model)
  - `NezhaConfig` configuration class: `NezhaForQuestionAnswering` (Nezha model)
  - `NystromformerConfig` configuration class: `NystromformerForQuestionAnswering` (Nyströmformer model)
  - `OPTConfig` configuration class: `OPTForQuestionAnswering` (OPT model)
  - `QDQBertConfig` configuration class: `QDQBertForQuestionAnswering` (QDQBert model)
  - `Qwen2Config` configuration class: `Qwen2ForQuestionAnswering` (Qwen2 model)
  - `Qwen2MoeConfig` configuration class: `Qwen2MoeForQuestionAnswering` (Qwen2MoE model)
  - `Qwen3Config` configuration class: `Qwen3ForQuestionAnswering` (Qwen3 model)
  - `Qwen3MoeConfig` configuration class: `Qwen3MoeForQuestionAnswering` (Qwen3MoE model)
  - `Qwen3NextConfig` configuration class: `Qwen3NextForQuestionAnswering` (Qwen3Next model)
  - `ReformerConfig` configuration class: `ReformerForQuestionAnswering` (Reformer model)
  - `RemBertConfig` configuration class: `RemBertForQuestionAnswering` (RemBERT model)
  - `RoCBertConfig` configuration class: `RoCBertForQuestionAnswering` (RoCBert model)
  - `RoFormerConfig` configuration class: `RoFormerForQuestionAnswering` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForQuestionAnswering) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `SeedOssConfig` configuration class: `SeedOssForQuestionAnswering` (SeedOss model)
  - `SmolLM3Config` configuration class: `SmolLM3ForQuestionAnswering` (SmolLM3 model)
  - `SplinterConfig` configuration class: `SplinterForQuestionAnswering` (Splinter model)
  - `SqueezeBertConfig` configuration class: `SqueezeBertForQuestionAnswering` (SqueezeBERT model)
  - `T5Config` configuration class: `T5ForQuestionAnswering` (T5 model)
  - `UMT5Config` configuration class: `UMT5ForQuestionAnswering` (UMT5 model)
  - `XLMConfig` configuration class: `XLMForQuestionAnsweringSimple` (XLM model)
  - `XLMRobertaConfig` configuration class: `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
  - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model)
  - `XLNetConfig` configuration class: `XLNetForQuestionAnsweringSimple` (XLNet model)
  - `XmodConfig` configuration class: `XmodForQuestionAnswering` (X-MOD model)
  - `YosoConfig` configuration class: `YosoForQuestionAnswering` (YOSO model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForQuestionAnswering` (ALBERT model) - `ArceeConfig` configuration class: `ArceeForQuestionAnswering` (Arcee model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [BartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForQuestionAnswering) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [BertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model) - `BigBirdConfig` configuration class: `BigBirdForQuestionAnswering` (BigBird model) - `BigBirdPegasusConfig` configuration class: `BigBirdPegasusForQuestionAnswering` (BigBird-Pegasus model) - `BloomConfig` configuration class: `BloomForQuestionAnswering` (BLOOM model) - `CamembertConfig` configuration class: `CamembertForQuestionAnswering` (CamemBERT model) - `CanineConfig` configuration class: `CanineForQuestionAnswering` (CANINE model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model) - `Data2VecTextConfig` configuration class: `Data2VecTextForQuestionAnswering` (Data2VecText model) - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model) - 
[DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model) - `DiffLlamaConfig` configuration class: `DiffLlamaForQuestionAnswering` (DiffLlama model) - `DistilBertConfig` configuration class: `DistilBertForQuestionAnswering` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForQuestionAnswering) (ELECTRA model) - `ErnieConfig` configuration class: `ErnieForQuestionAnswering` (ERNIE model) - `ErnieMConfig` configuration class: `ErnieMForQuestionAnswering` (ErnieM model) - [Exaone4Config](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForQuestionAnswering) (EXAONE-4.0 model) - `FNetConfig` configuration class: `FNetForQuestionAnswering` (FNet model) - `FalconConfig` configuration class: `FalconForQuestionAnswering` (Falcon model) - `FlaubertConfig` configuration class: `FlaubertForQuestionAnsweringSimple` (FlauBERT model) - `FunnelConfig` configuration class: `FunnelForQuestionAnswering` (Funnel Transformer model) - [GPT2Config](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForQuestionAnswering) (OpenAI GPT-2 model) - `GPTJConfig` configuration class: `GPTJForQuestionAnswering` (GPT-J model) - `GPTNeoConfig` configuration class: `GPTNeoForQuestionAnswering` (GPT Neo model) - `GPTNeoXConfig` configuration class: `GPTNeoXForQuestionAnswering` (GPT NeoX model) - `IBertConfig` configuration class: 
`IBertForQuestionAnswering` (I-BERT model) - `LEDConfig` configuration class: `LEDForQuestionAnswering` (LED model) - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model) - `LiltConfig` configuration class: `LiltForQuestionAnswering` (LiLT model) - [LlamaConfig](/docs/transformers/v4.57.1/ko/model_doc/llama2#transformers.LlamaConfig) configuration class: `LlamaForQuestionAnswering` (LLaMA model) - `LongformerConfig` configuration class: `LongformerForQuestionAnswering` (Longformer model) - `LukeConfig` configuration class: `LukeForQuestionAnswering` (LUKE model) - `LxmertConfig` configuration class: `LxmertForQuestionAnswering` (LXMERT model) - `MBartConfig` configuration class: `MBartForQuestionAnswering` (mBART model) - `MPNetConfig` configuration class: `MPNetForQuestionAnswering` (MPNet model) - `MT5Config` configuration class: `MT5ForQuestionAnswering` (MT5 model) - `MarkupLMConfig` configuration class: `MarkupLMForQuestionAnswering` (MarkupLM model) - `MegaConfig` configuration class: `MegaForQuestionAnswering` (MEGA model) - `MegatronBertConfig` configuration class: `MegatronBertForQuestionAnswering` (Megatron-BERT model) - `MiniMaxConfig` configuration class: `MiniMaxForQuestionAnswering` (MiniMax model) - `MinistralConfig` configuration class: `MinistralForQuestionAnswering` (Ministral model) - [MistralConfig](/docs/transformers/v4.57.1/ko/model_doc/mistral#transformers.MistralConfig) configuration class: `MistralForQuestionAnswering` (Mistral model) - `MixtralConfig` configuration class: `MixtralForQuestionAnswering` (Mixtral model) - `MobileBertConfig` configuration class: `MobileBertForQuestionAnswering` (MobileBERT model) - `ModernBertConfig` configuration class: `ModernBertForQuestionAnswering` (ModernBERT model) - `MptConfig` configuration class: `MptForQuestionAnswering` (MPT model) - `MraConfig` configuration class: 
`MraForQuestionAnswering` (MRA model) - `MvpConfig` configuration class: `MvpForQuestionAnswering` (MVP model) - `NemotronConfig` configuration class: `NemotronForQuestionAnswering` (Nemotron model) - `NezhaConfig` configuration class: `NezhaForQuestionAnswering` (Nezha model) - `NystromformerConfig` configuration class: `NystromformerForQuestionAnswering` (Nyströmformer model) - `OPTConfig` configuration class: `OPTForQuestionAnswering` (OPT model) - `QDQBertConfig` configuration class: `QDQBertForQuestionAnswering` (QDQBert model) - `Qwen2Config` configuration class: `Qwen2ForQuestionAnswering` (Qwen2 model) - `Qwen2MoeConfig` configuration class: `Qwen2MoeForQuestionAnswering` (Qwen2MoE model) - `Qwen3Config` configuration class: `Qwen3ForQuestionAnswering` (Qwen3 model) - `Qwen3MoeConfig` configuration class: `Qwen3MoeForQuestionAnswering` (Qwen3MoE model) - `Qwen3NextConfig` configuration class: `Qwen3NextForQuestionAnswering` (Qwen3Next model) - `ReformerConfig` configuration class: `ReformerForQuestionAnswering` (Reformer model) - `RemBertConfig` configuration class: `RemBertForQuestionAnswering` (RemBERT model) - `RoCBertConfig` configuration class: `RoCBertForQuestionAnswering` (RoCBert model) - `RoFormerConfig` configuration class: `RoFormerForQuestionAnswering` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForQuestionAnswering) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `SeedOssConfig` configuration class: `SeedOssForQuestionAnswering` (SeedOss model) - `SmolLM3Config` configuration class: `SmolLM3ForQuestionAnswering` (SmolLM3 model) - `SplinterConfig` configuration class: `SplinterForQuestionAnswering` (Splinter model) - `SqueezeBertConfig` configuration class: 
`SqueezeBertForQuestionAnswering` (SqueezeBERT model) - `T5Config` configuration class: `T5ForQuestionAnswering` (T5 model) - `UMT5Config` configuration class: `UMT5ForQuestionAnswering` (UMT5 model) - `XLMConfig` configuration class: `XLMForQuestionAnsweringSimple` (XLM model) - `XLMRobertaConfig` configuration class: `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model) - `XLMRobertaXLConfig` configuration class: `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model) - `XLNetConfig` configuration class: `XLNetForQuestionAnsweringSimple` (XLNet model) - `XmodConfig` configuration class: `XmodForQuestionAnswering` (X-MOD model) - `YosoConfig` configuration class: `YosoForQuestionAnswering` (YOSO model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
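The version-dependent default described above can be sketched as a small helper. This is a simplified illustration only: the function name is hypothetical, and the real selection logic in Transformers also checks model and hardware support, not just the torch version.

```python
# Simplified sketch of the documented default: SDPA for torch>=2.1.1 when
# available, otherwise the manual "eager" implementation. Illustration only;
# the actual transformers logic also checks model/hardware support.
def default_attn_implementation(torch_version: str, sdpa_available: bool = True) -> str:
    # Parse a plain "major.minor.patch" version string into a tuple
    parts = tuple(int(x) for x in torch_version.split(".")[:3])
    if sdpa_available and parts >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(default_attn_implementation("2.2.0"))  # sdpa
print(default_attn_implementation("2.0.1"))  # eager
```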
#### from_pretrained[[transformers.AutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForQuestionAnswering` (ALBERT model)
- **arcee** -- `ArceeForQuestionAnswering` (Arcee model)
- **bart** -- [BartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartForQuestionAnswering) (BART model)
- **bert** -- [BertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertForQuestionAnswering) (BERT model)
- **big_bird** -- `BigBirdForQuestionAnswering` (BigBird model)
- **bigbird_pegasus** -- `BigBirdPegasusForQuestionAnswering` (BigBird-Pegasus model)
- **bloom** -- `BloomForQuestionAnswering` (BLOOM model)
- **camembert** -- `CamembertForQuestionAnswering` (CamemBERT model)
- **canine** -- `CanineForQuestionAnswering` (CANINE model)
- **convbert** -- [ConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBERT model)
- **data2vec-text** -- `Data2VecTextForQuestionAnswering` (Data2VecText model)
- **deberta** -- [DebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DeBERTa model)
- **deberta-v2** -- [DebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
- **diffllama** -- `DiffLlamaForQuestionAnswering` (DiffLlama model)
- **distilbert** -- `DistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- [ElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraForQuestionAnswering) (ELECTRA model)
- **ernie** -- `ErnieForQuestionAnswering` (ERNIE model)
- **ernie_m** -- `ErnieMForQuestionAnswering` (ErnieM model)
- **exaone4** -- [Exaone4ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/exaone4#transformers.Exaone4ForQuestionAnswering) (EXAONE-4.0 model)
- **falcon** -- `FalconForQuestionAnswering` (Falcon model)
- **flaubert** -- `FlaubertForQuestionAnsweringSimple` (FlauBERT model)
- **fnet** -- `FNetForQuestionAnswering` (FNet model)
- **funnel** -- `FunnelForQuestionAnswering` (Funnel Transformer model)
- **gpt2** -- [GPT2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/gpt2#transformers.GPT2ForQuestionAnswering) (OpenAI GPT-2 model)
- **gpt_neo** -- `GPTNeoForQuestionAnswering` (GPT Neo model)
- **gpt_neox** -- `GPTNeoXForQuestionAnswering` (GPT NeoX model)
- **gptj** -- `GPTJForQuestionAnswering` (GPT-J model)
- **ibert** -- `IBertForQuestionAnswering` (I-BERT model)
- **layoutlmv2** -- `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
- **led** -- `LEDForQuestionAnswering` (LED model)
- **lilt** -- `LiltForQuestionAnswering` (LiLT model)
- **llama** -- `LlamaForQuestionAnswering` (LLaMA model)
- **longformer** -- `LongformerForQuestionAnswering` (Longformer model)
- **luke** -- `LukeForQuestionAnswering` (LUKE model)
- **lxmert** -- `LxmertForQuestionAnswering` (LXMERT model)
- **markuplm** -- `MarkupLMForQuestionAnswering` (MarkupLM model)
- **mbart** -- `MBartForQuestionAnswering` (mBART model)
- **mega** -- `MegaForQuestionAnswering` (MEGA model)
- **megatron-bert** -- `MegatronBertForQuestionAnswering` (Megatron-BERT model)
- **minimax** -- `MiniMaxForQuestionAnswering` (MiniMax model)
- **ministral** -- `MinistralForQuestionAnswering` (Ministral model)
- **mistral** -- `MistralForQuestionAnswering` (Mistral model)
- **mixtral** -- `MixtralForQuestionAnswering` (Mixtral model)
- **mobilebert** -- `MobileBertForQuestionAnswering` (MobileBERT model)
- **modernbert** -- `ModernBertForQuestionAnswering` (ModernBERT model)
- **mpnet** -- `MPNetForQuestionAnswering` (MPNet model)
- **mpt** -- `MptForQuestionAnswering` (MPT model)
- **mra** -- `MraForQuestionAnswering` (MRA model)
- **mt5** -- `MT5ForQuestionAnswering` (MT5 model)
- **mvp** -- `MvpForQuestionAnswering` (MVP model)
- **nemotron** -- `NemotronForQuestionAnswering` (Nemotron model)
- **nezha** -- `NezhaForQuestionAnswering` (Nezha model)
- **nystromformer** -- `NystromformerForQuestionAnswering` (Nyströmformer model)
- **opt** -- `OPTForQuestionAnswering` (OPT model)
- **qdqbert** -- `QDQBertForQuestionAnswering` (QDQBert model)
- **qwen2** -- `Qwen2ForQuestionAnswering` (Qwen2 model)
- **qwen2_moe** -- `Qwen2MoeForQuestionAnswering` (Qwen2MoE model)
- **qwen3** -- `Qwen3ForQuestionAnswering` (Qwen3 model)
- **qwen3_moe** -- `Qwen3MoeForQuestionAnswering` (Qwen3MoE model)
- **qwen3_next** -- `Qwen3NextForQuestionAnswering` (Qwen3Next model)
- **reformer** -- `ReformerForQuestionAnswering` (Reformer model)
- **rembert** -- `RemBertForQuestionAnswering` (RemBERT model)
- **roberta** -- [RobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaForQuestionAnswering) (RoBERTa model)
- **roberta-prelayernorm** -- `RobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roc_bert** -- `RoCBertForQuestionAnswering` (RoCBert model)
- **roformer** -- `RoFormerForQuestionAnswering` (RoFormer model)
- **seed_oss** -- `SeedOssForQuestionAnswering` (SeedOss model)
- **smollm3** -- `SmolLM3ForQuestionAnswering` (SmolLM3 model)
- **splinter** -- `SplinterForQuestionAnswering` (Splinter model)
- **squeezebert** -- `SqueezeBertForQuestionAnswering` (SqueezeBERT model)
- **t5** -- `T5ForQuestionAnswering` (T5 model)
- **umt5** -- `UMT5ForQuestionAnswering` (UMT5 model)
- **xlm** -- `XLMForQuestionAnsweringSimple` (XLM model)
- **xlm-roberta** -- `XLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **xlm-roberta-xl** -- `XLMRobertaXLForQuestionAnswering` (XLM-RoBERTa-XL model)
- **xlnet** -- `XLNetForQuestionAnsweringSimple` (XLNet model)
- **xmod** -- `XmodForQuestionAnswering` (X-MOD model)
- **yoso** -- `YosoForQuestionAnswering` (YOSO model)
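The selection rule above can be sketched as a simple dispatch: look up the config's `model_type` in a mapping, and fall back to pattern matching on the checkpoint name when no config is available. This is an illustrative sketch with a tiny subset of the mapping, not the actual transformers implementation.

```python
# Illustrative subset of the model_type -> class mapping shown above.
_QA_MAPPING = {
    "bert": "BertForQuestionAnswering",
    "roberta": "RobertaForQuestionAnswering",
    "xlm-roberta": "XLMRobertaForQuestionAnswering",
}

def resolve_qa_class(model_type=None, name_or_path=""):
    # Primary path: dispatch on the config's model_type property
    if model_type is not None:
        return _QA_MAPPING[model_type]
    # Fallback: longest-match pattern matching on the checkpoint name,
    # so "xlm-roberta-base" matches "xlm-roberta" rather than "roberta"
    for key in sorted(_QA_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return _QA_MAPPING[key]
    raise ValueError(f"Unrecognized checkpoint: {name_or_path!r}")

print(resolve_qa_class(model_type="bert"))                    # BertForQuestionAnswering
print(resolve_qa_class(name_or_path="xlm-roberta-base"))      # XLMRobertaForQuestionAnswering
```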

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
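The documented split of `**kwargs` when no explicit `config` is passed can be sketched as follows. This is an assumption-level illustration of the behavior described above (the helper name is hypothetical, not the real transformers code): keys matching configuration attributes update the config, and the rest are forwarded to the model's `__init__`.

```python
# Sketch of how from_pretrained splits **kwargs when no explicit `config`
# is passed: config-attribute keys override the config, leftovers go to
# the model's __init__. Illustration only, not the real implementation.
def split_kwargs(config_attrs, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs

cfg_updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},          # known config attributes
    {"output_attentions": True, "some_model_arg": 1},
)
print(cfg_updates)   # {'output_attentions': True}
print(model_kwargs)  # {'some_model_arg': 1}
```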

### TFAutoModelForQuestionAnswering[[transformers.TFAutoModelForQuestionAnswering]]

#### transformers.TFAutoModelForQuestionAnswering[[transformers.TFAutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L646)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForQuestionAnswering` (ALBERT model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model)
  - `CamembertConfig` configuration class: `TFCamembertForQuestionAnswering` (CamemBERT model)
  - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model)
  - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model)
  - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
  - `DistilBertConfig` configuration class: `TFDistilBertForQuestionAnswering` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForQuestionAnswering) (ELECTRA model)
  - `FlaubertConfig` configuration class: `TFFlaubertForQuestionAnsweringSimple` (FlauBERT model)
  - `FunnelConfig` configuration class: `TFFunnelForQuestionAnswering` (Funnel Transformer model)
  - `GPTJConfig` configuration class: `TFGPTJForQuestionAnswering` (GPT-J model)
  - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
  - `LongformerConfig` configuration class: `TFLongformerForQuestionAnswering` (Longformer model)
  - `MPNetConfig` configuration class: `TFMPNetForQuestionAnswering` (MPNet model)
  - `MobileBertConfig` configuration class: `TFMobileBertForQuestionAnswering` (MobileBERT model)
  - `RemBertConfig` configuration class: `TFRemBertForQuestionAnswering` (RemBERT model)
  - `RoFormerConfig` configuration class: `TFRoFormerForQuestionAnswering` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForQuestionAnswering) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `XLMConfig` configuration class: `TFXLMForQuestionAnsweringSimple` (XLM model)
  - `XLMRobertaConfig` configuration class: `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
  - `XLNetConfig` configuration class: `TFXLNetForQuestionAnsweringSimple` (XLNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `TFAlbertForQuestionAnswering` (ALBERT model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model) - `CamembertConfig` configuration class: `TFCamembertForQuestionAnswering` (CamemBERT model) - [ConvBertConfig](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.ConvBertConfig) configuration class: [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model) - [DebertaConfig](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.DebertaConfig) configuration class: [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model) - [DebertaV2Config](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model) - `DistilBertConfig` configuration class: `TFDistilBertForQuestionAnswering` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [TFElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForQuestionAnswering) (ELECTRA model) - `FlaubertConfig` configuration class: `TFFlaubertForQuestionAnsweringSimple` (FlauBERT model) - `FunnelConfig` configuration class: `TFFunnelForQuestionAnswering` (Funnel 
Transformer model) - `GPTJConfig` configuration class: `TFGPTJForQuestionAnswering` (GPT-J model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model) - `LongformerConfig` configuration class: `TFLongformerForQuestionAnswering` (Longformer model) - `MPNetConfig` configuration class: `TFMPNetForQuestionAnswering` (MPNet model) - `MobileBertConfig` configuration class: `TFMobileBertForQuestionAnswering` (MobileBERT model) - `RemBertConfig` configuration class: `TFRemBertForQuestionAnswering` (RemBERT model) - `RoFormerConfig` configuration class: `TFRoFormerForQuestionAnswering` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [TFRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForQuestionAnswering) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `XLMConfig` configuration class: `TFXLMForQuestionAnsweringSimple` (XLM model) - `XLMRobertaConfig` configuration class: `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model) - `XLNetConfig` configuration class: `TFXLNetForQuestionAnsweringSimple` (XLNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `TFAlbertForQuestionAnswering` (ALBERT model)
- **bert** -- [TFBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.TFBertForQuestionAnswering) (BERT model)
- **camembert** -- `TFCamembertForQuestionAnswering` (CamemBERT model)
- **convbert** -- [TFConvBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/convbert#transformers.TFConvBertForQuestionAnswering) (ConvBERT model)
- **deberta** -- [TFDebertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta#transformers.TFDebertaForQuestionAnswering) (DeBERTa model)
- **deberta-v2** -- [TFDebertaV2ForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/deberta-v2#transformers.TFDebertaV2ForQuestionAnswering) (DeBERTa-v2 model)
- **distilbert** -- `TFDistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- [TFElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.TFElectraForQuestionAnswering) (ELECTRA model)
- **flaubert** -- `TFFlaubertForQuestionAnsweringSimple` (FlauBERT model)
- **funnel** -- `TFFunnelForQuestionAnswering` (Funnel Transformer model)
- **gptj** -- `TFGPTJForQuestionAnswering` (GPT-J model)
- **layoutlmv3** -- `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)
- **longformer** -- `TFLongformerForQuestionAnswering` (Longformer model)
- **mobilebert** -- `TFMobileBertForQuestionAnswering` (MobileBERT model)
- **mpnet** -- `TFMPNetForQuestionAnswering` (MPNet model)
- **rembert** -- `TFRemBertForQuestionAnswering` (RemBERT model)
- **roberta** -- [TFRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.TFRobertaForQuestionAnswering) (RoBERTa model)
- **roberta-prelayernorm** -- `TFRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roformer** -- `TFRoFormerForQuestionAnswering` (RoFormer model)
- **xlm** -- `TFXLMForQuestionAnsweringSimple` (XLM model)
- **xlm-roberta** -- `TFXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **xlnet** -- `TFXLNetForQuestionAnsweringSimple` (XLNet model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForQuestionAnswering[[transformers.FlaxAutoModelForQuestionAnswering]]

#### transformers.FlaxAutoModelForQuestionAnswering[[transformers.FlaxAutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L329)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForQuestionAnswering` (ALBERT model)
  - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model)
  - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model)
  - `BigBirdConfig` configuration class: `FlaxBigBirdForQuestionAnswering` (BigBird model)
  - `DistilBertConfig` configuration class: `FlaxDistilBertForQuestionAnswering` (DistilBERT model)
  - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForQuestionAnswering) (ELECTRA model)
  - `MBartConfig` configuration class: `FlaxMBartForQuestionAnswering` (mBART model)
  - `RoFormerConfig` configuration class: `FlaxRoFormerForQuestionAnswering` (RoFormer model)
  - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForQuestionAnswering) (RoBERTa model)
  - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
  - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v4.57.1/ko/model_doc/albert#transformers.AlbertConfig) configuration class: `FlaxAlbertForQuestionAnswering` (ALBERT model) - [BartConfig](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.BartConfig) configuration class: [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model) - [BertConfig](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.BertConfig) configuration class: [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model) - `BigBirdConfig` configuration class: `FlaxBigBirdForQuestionAnswering` (BigBird model) - `DistilBertConfig` configuration class: `FlaxDistilBertForQuestionAnswering` (DistilBERT model) - [ElectraConfig](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.ElectraConfig) configuration class: [FlaxElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForQuestionAnswering) (ELECTRA model) - `MBartConfig` configuration class: `FlaxMBartForQuestionAnswering` (mBART model) - `RoFormerConfig` configuration class: `FlaxRoFormerForQuestionAnswering` (RoFormer model) - [RobertaConfig](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.RobertaConfig) configuration class: [FlaxRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForQuestionAnswering) (RoBERTa model) - `RobertaPreLayerNormConfig` configuration class: `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model) - `XLMRobertaConfig` configuration class: `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `FlaxAlbertForQuestionAnswering` (ALBERT model)
- **bart** -- [FlaxBartForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bart#transformers.FlaxBartForQuestionAnswering) (BART model)
- **bert** -- [FlaxBertForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/bert#transformers.FlaxBertForQuestionAnswering) (BERT model)
- **big_bird** -- `FlaxBigBirdForQuestionAnswering` (BigBird model)
- **distilbert** -- `FlaxDistilBertForQuestionAnswering` (DistilBERT model)
- **electra** -- [FlaxElectraForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/electra#transformers.FlaxElectraForQuestionAnswering) (ELECTRA model)
- **mbart** -- `FlaxMBartForQuestionAnswering` (mBART model)
- **roberta** -- [FlaxRobertaForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/roberta#transformers.FlaxRobertaForQuestionAnswering) (RoBERTa model)
- **roberta-prelayernorm** -- `FlaxRobertaPreLayerNormForQuestionAnswering` (RoBERTa-PreLayerNorm model)
- **roformer** -- `FlaxRoFormerForQuestionAnswering` (RoFormer model)
- **xlm-roberta** -- `FlaxXLMRobertaForQuestionAnswering` (XLM-RoBERTa model)
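The selection logic described above — look up the config's `model_type`, then fall back to pattern matching on the name or path — can be sketched in plain Python (an illustration of the mechanism only, not the actual `transformers` code; the mapping is a toy subset and `resolve_model_class` is a hypothetical helper):

```python
# Toy sketch of the Auto-class dispatch: prefer the config's `model_type`,
# otherwise fall back to substring matching on the checkpoint name.

MODEL_MAPPING = {
    "bert": "FlaxBertForQuestionAnswering",
    "roberta": "FlaxRobertaForQuestionAnswering",
    "xlm-roberta": "FlaxXLMRobertaForQuestionAnswering",
}

def resolve_model_class(name_or_path, model_type=None):
    # 1. Trust the config's model_type when it is available.
    if model_type is not None and model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # 2. Otherwise pattern-match on the name/path. Longer keys are tried
    #    first so "xlm-roberta" wins over the shorter "roberta".
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_MAPPING[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")

print(resolve_model_class("google-bert/bert-base-cased"))   # FlaxBertForQuestionAnswering
print(resolve_model_class("FacebookAI/xlm-roberta-base"))   # FlaxXLMRobertaForQuestionAnswering
```

The ordering detail matters in the real library too: overlapping keys such as `roberta` and `xlm-roberta` must be disambiguated before the fallback match is trusted.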

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch checkpoint to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

#### transformers.AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1932)

### TFAutoModelForTextEncoding[[transformers.TFAutoModelForTextEncoding]]

#### transformers.TFAutoModelForTextEncoding[[transformers.TFAutoModelForTextEncoding]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L534)

## 컴퓨터 비전[[computer-vision]]

다음 자동 클래스들은 아래의 컴퓨터 비전 작업에 사용할 수 있습니다.

### AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

#### transformers.AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2144)

This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDepthEstimation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DPTConfig` configuration class: `DPTForDepthEstimation` (DPT model)
  - `DepthAnythingConfig` configuration class: `DepthAnythingForDepthEstimation` (Depth Anything model)
  - `DepthProConfig` configuration class: `DepthProForDepthEstimation` (DepthPro model)
  - `GLPNConfig` configuration class: `GLPNForDepthEstimation` (GLPN model)
  - `PromptDepthAnythingConfig` configuration class: `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model)
  - `ZoeDepthConfig` configuration class: `ZoeDepthForDepthEstimation` (ZoeDepth model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Intel/dpt-large")
>>> model = AutoModelForDepthEstimation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DPTConfig` configuration class: `DPTForDepthEstimation` (DPT model) - `DepthAnythingConfig` configuration class: `DepthAnythingForDepthEstimation` (Depth Anything model) - `DepthProConfig` configuration class: `DepthProForDepthEstimation` (DepthPro model) - `GLPNConfig` configuration class: `GLPNForDepthEstimation` (GLPN model) - `PromptDepthAnythingConfig` configuration class: `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model) - `ZoeDepthConfig` configuration class: `ZoeDepthForDepthEstimation` (ZoeDepth model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForDepthEstimation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **depth_anything** -- `DepthAnythingForDepthEstimation` (Depth Anything model)
- **depth_pro** -- `DepthProForDepthEstimation` (DepthPro model)
- **dpt** -- `DPTForDepthEstimation` (DPT model)
- **glpn** -- `GLPNForDepthEstimation` (GLPN model)
- **prompt_depth_anything** -- `PromptDepthAnythingForDepthEstimation` (PromptDepthAnything model)
- **zoedepth** -- `ZoeDepthForDepthEstimation` (ZoeDepth model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`
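The eval-by-default behavior above can be mimicked with a tiny stand-in module (a toy illustration of the `train()`/`eval()` flag that PyTorch modules carry, not the real `nn.Module`; `TinyModule` is a hypothetical class):

```python
# Toy stand-in for a module with a training flag: dropout-style behavior is
# active only in training mode, which is why from_pretrained() returns an
# eval-mode model ready for inference.
import random

class TinyModule:
    def __init__(self, p_drop=0.5):
        self.training = False  # from_pretrained() leaves the model in eval mode
        self.p_drop = p_drop

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def forward(self, x):
        if self.training and random.random() < self.p_drop:
            return 0.0  # dropout zeroes activations only while training
        return x

m = TinyModule()
print(m.training)      # False: inference-ready out of the box
print(m.forward(1.0))  # 1.0: dropout is inactive in eval mode
m.train()
print(m.training)      # True: ready for fine-tuning
```

Forgetting `model.train()` before fine-tuning is a common source of silently frozen regularization, which is why the docs call it out explicitly.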

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")

>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/dpt_tf_model_config.json")
>>> model = AutoModelForDepthEstimation.from_pretrained(
...     "./tf_model/dpt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

#### transformers.AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2069)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `BeitConfig` configuration class: `BeitForImageClassification` (BEiT model)
  - `BitConfig` configuration class: `BitForImageClassification` (BiT model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPForImageClassification) (CLIP model)
  - `ConvNextConfig` configuration class: `ConvNextForImageClassification` (ConvNeXT model)
  - `ConvNextV2Config` configuration class: `ConvNextV2ForImageClassification` (ConvNeXTV2 model)
  - `CvtConfig` configuration class: `CvtForImageClassification` (CvT model)
  - `Data2VecVisionConfig` configuration class: `Data2VecVisionForImageClassification` (Data2VecVision model)
  - `DeiTConfig` configuration class: `DeiTForImageClassification` or `DeiTForImageClassificationWithTeacher` (DeiT model)
  - `DinatConfig` configuration class: `DinatForImageClassification` (DiNAT model)
  - `Dinov2Config` configuration class: `Dinov2ForImageClassification` (DINOv2 model)
  - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersForImageClassification` (DINOv2 with Registers model)
  - `DonutSwinConfig` configuration class: `DonutSwinForImageClassification` (DonutSwin model)
  - `EfficientFormerConfig` configuration class: `EfficientFormerForImageClassification` or `EfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
  - `EfficientNetConfig` configuration class: `EfficientNetForImageClassification` (EfficientNet model)
  - `FocalNetConfig` configuration class: `FocalNetForImageClassification` (FocalNet model)
  - `HGNetV2Config` configuration class: `HGNetV2ForImageClassification` (HGNet-V2 model)
  - `HieraConfig` configuration class: `HieraForImageClassification` (Hiera model)
  - `IJepaConfig` configuration class: `IJepaForImageClassification` (I-JEPA model)
  - `ImageGPTConfig` configuration class: `ImageGPTForImageClassification` (ImageGPT model)
  - `LevitConfig` configuration class: `LevitForImageClassification` or `LevitForImageClassificationWithTeacher` (LeViT model)
  - `MetaClip2Config` configuration class: `MetaClip2ForImageClassification` (MetaCLIP 2 model)
  - `MobileNetV1Config` configuration class: `MobileNetV1ForImageClassification` (MobileNetV1 model)
  - `MobileNetV2Config` configuration class: `MobileNetV2ForImageClassification` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTForImageClassification` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2ForImageClassification` (MobileViTV2 model)
  - `NatConfig` configuration class: `NatForImageClassification` (NAT model)
  - `PerceiverConfig` configuration class: `PerceiverForImageClassificationLearned` or `PerceiverForImageClassificationFourier` or `PerceiverForImageClassificationConvProcessing` (Perceiver model)
  - `PoolFormerConfig` configuration class: `PoolFormerForImageClassification` (PoolFormer model)
  - `PvtConfig` configuration class: `PvtForImageClassification` (PVT model)
  - `PvtV2Config` configuration class: `PvtV2ForImageClassification` (PVTv2 model)
  - `RegNetConfig` configuration class: `RegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `ResNetForImageClassification` (ResNet model)
  - `SegformerConfig` configuration class: `SegformerForImageClassification` (SegFormer model)
  - `ShieldGemma2Config` configuration class: `ShieldGemma2ForImageClassification` (Shieldgemma2 model)
  - `Siglip2Config` configuration class: `Siglip2ForImageClassification` (SigLIP2 model)
  - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipForImageClassification) (SigLIP model)
  - `SwiftFormerConfig` configuration class: `SwiftFormerForImageClassification` (SwiftFormer model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForImageClassification) (Swin Transformer model)
  - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swin Transformer V2 model)
  - `TextNetConfig` configuration class: `TextNetForImageClassification` (TextNet model)
  - `TimmWrapperConfig` configuration class: `TimmWrapperForImageClassification` (TimmWrapperModel model)
  - `VanConfig` configuration class: `VanForImageClassification` (VAN model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForImageClassification) (ViT model)
  - `ViTHybridConfig` configuration class: `ViTHybridForImageClassification` (ViT Hybrid model)
  - `ViTMSNConfig` configuration class: `ViTMSNForImageClassification` (ViTMSN model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = AutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `BeitConfig` configuration class: `BeitForImageClassification` (BEiT model) - `BitConfig` configuration class: `BitForImageClassification` (BiT model) - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPForImageClassification) (CLIP model) - `ConvNextConfig` configuration class: `ConvNextForImageClassification` (ConvNeXT model) - `ConvNextV2Config` configuration class: `ConvNextV2ForImageClassification` (ConvNeXTV2 model) - `CvtConfig` configuration class: `CvtForImageClassification` (CvT model) - `Data2VecVisionConfig` configuration class: `Data2VecVisionForImageClassification` (Data2VecVision model) - `DeiTConfig` configuration class: `DeiTForImageClassification` or `DeiTForImageClassificationWithTeacher` (DeiT model) - `DinatConfig` configuration class: `DinatForImageClassification` (DiNAT model) - `Dinov2Config` configuration class: `Dinov2ForImageClassification` (DINOv2 model) - `Dinov2WithRegistersConfig` configuration class: `Dinov2WithRegistersForImageClassification` (DINOv2 with Registers model) - `DonutSwinConfig` configuration class: `DonutSwinForImageClassification` (DonutSwin model) - `EfficientFormerConfig` configuration class: `EfficientFormerForImageClassification` or `EfficientFormerForImageClassificationWithTeacher` (EfficientFormer model) - `EfficientNetConfig` configuration class: `EfficientNetForImageClassification` (EfficientNet model) - `FocalNetConfig` configuration class: `FocalNetForImageClassification` (FocalNet model) - `HGNetV2Config` configuration class: `HGNetV2ForImageClassification` (HGNet-V2 model) - `HieraConfig` configuration class: `HieraForImageClassification` (Hiera model) - `IJepaConfig` 
configuration class: `IJepaForImageClassification` (I-JEPA model) - `ImageGPTConfig` configuration class: `ImageGPTForImageClassification` (ImageGPT model) - `LevitConfig` configuration class: `LevitForImageClassification` or `LevitForImageClassificationWithTeacher` (LeViT model) - `MetaClip2Config` configuration class: `MetaClip2ForImageClassification` (MetaCLIP 2 model) - `MobileNetV1Config` configuration class: `MobileNetV1ForImageClassification` (MobileNetV1 model) - `MobileNetV2Config` configuration class: `MobileNetV2ForImageClassification` (MobileNetV2 model) - `MobileViTConfig` configuration class: `MobileViTForImageClassification` (MobileViT model) - `MobileViTV2Config` configuration class: `MobileViTV2ForImageClassification` (MobileViTV2 model) - `NatConfig` configuration class: `NatForImageClassification` (NAT model) - `PerceiverConfig` configuration class: `PerceiverForImageClassificationLearned` or `PerceiverForImageClassificationFourier` or `PerceiverForImageClassificationConvProcessing` (Perceiver model) - `PoolFormerConfig` configuration class: `PoolFormerForImageClassification` (PoolFormer model) - `PvtConfig` configuration class: `PvtForImageClassification` (PVT model) - `PvtV2Config` configuration class: `PvtV2ForImageClassification` (PVTv2 model) - `RegNetConfig` configuration class: `RegNetForImageClassification` (RegNet model) - `ResNetConfig` configuration class: `ResNetForImageClassification` (ResNet model) - `SegformerConfig` configuration class: `SegformerForImageClassification` (SegFormer model) - `ShieldGemma2Config` configuration class: `ShieldGemma2ForImageClassification` (Shieldgemma2 model) - `Siglip2Config` configuration class: `Siglip2ForImageClassification` (SigLIP2 model) - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipForImageClassification) (SigLIP model) - 
`SwiftFormerConfig` configuration class: `SwiftFormerForImageClassification` (SwiftFormer model) - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForImageClassification) (Swin Transformer model) - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swin Transformer V2 model) - `TextNetConfig` configuration class: `TextNetForImageClassification` (TextNet model) - `TimmWrapperConfig` configuration class: `TimmWrapperForImageClassification` (TimmWrapperModel model) - `VanConfig` configuration class: `VanForImageClassification` (VAN model) - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForImageClassification) (ViT model) - `ViTHybridConfig` configuration class: `ViTHybridForImageClassification` (ViT Hybrid model) - `ViTMSNConfig` configuration class: `ViTMSNForImageClassification` (ViTMSN model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- `BeitForImageClassification` (BEiT model)
- **bit** -- `BitForImageClassification` (BiT model)
- **clip** -- [CLIPForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPForImageClassification) (CLIP model)
- **convnext** -- `ConvNextForImageClassification` (ConvNeXT model)
- **convnextv2** -- `ConvNextV2ForImageClassification` (ConvNeXTV2 model)
- **cvt** -- `CvtForImageClassification` (CvT model)
- **data2vec-vision** -- `Data2VecVisionForImageClassification` (Data2VecVision model)
- **deit** -- `DeiTForImageClassification` or `DeiTForImageClassificationWithTeacher` (DeiT model)
- **dinat** -- `DinatForImageClassification` (DiNAT model)
- **dinov2** -- `Dinov2ForImageClassification` (DINOv2 model)
- **dinov2_with_registers** -- `Dinov2WithRegistersForImageClassification` (DINOv2 with Registers model)
- **donut-swin** -- `DonutSwinForImageClassification` (DonutSwin model)
- **efficientformer** -- `EfficientFormerForImageClassification` or `EfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
- **efficientnet** -- `EfficientNetForImageClassification` (EfficientNet model)
- **focalnet** -- `FocalNetForImageClassification` (FocalNet model)
- **hgnet_v2** -- `HGNetV2ForImageClassification` (HGNet-V2 model)
- **hiera** -- `HieraForImageClassification` (Hiera model)
- **ijepa** -- `IJepaForImageClassification` (I-JEPA model)
- **imagegpt** -- `ImageGPTForImageClassification` (ImageGPT model)
- **levit** -- `LevitForImageClassification` or `LevitForImageClassificationWithTeacher` (LeViT model)
- **metaclip_2** -- `MetaClip2ForImageClassification` (MetaCLIP 2 model)
- **mobilenet_v1** -- `MobileNetV1ForImageClassification` (MobileNetV1 model)
- **mobilenet_v2** -- `MobileNetV2ForImageClassification` (MobileNetV2 model)
- **mobilevit** -- `MobileViTForImageClassification` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2ForImageClassification` (MobileViTV2 model)
- **nat** -- `NatForImageClassification` (NAT model)
- **perceiver** -- `PerceiverForImageClassificationLearned` or `PerceiverForImageClassificationFourier` or `PerceiverForImageClassificationConvProcessing` (Perceiver model)
- **poolformer** -- `PoolFormerForImageClassification` (PoolFormer model)
- **pvt** -- `PvtForImageClassification` (PVT model)
- **pvt_v2** -- `PvtV2ForImageClassification` (PVTv2 model)
- **regnet** -- `RegNetForImageClassification` (RegNet model)
- **resnet** -- `ResNetForImageClassification` (ResNet model)
- **segformer** -- `SegformerForImageClassification` (SegFormer model)
- **shieldgemma2** -- `ShieldGemma2ForImageClassification` (Shieldgemma2 model)
- **siglip** -- [SiglipForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipForImageClassification) (SigLIP model)
- **siglip2** -- `Siglip2ForImageClassification` (SigLIP2 model)
- **swiftformer** -- `SwiftFormerForImageClassification` (SwiftFormer model)
- **swin** -- [SwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForImageClassification) (Swin Transformer model)
- **swinv2** -- [Swinv2ForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swin Transformer V2 model)
- **textnet** -- `TextNetForImageClassification` (TextNet model)
- **timm_wrapper** -- `TimmWrapperForImageClassification` (TimmWrapperModel model)
- **van** -- `VanForImageClassification` (VAN model)
- **vit** -- [ViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForImageClassification) (ViT model)
- **vit_hybrid** -- `ViTHybridForImageClassification` (ViT Hybrid model)
- **vit_msn** -- `ViTMSNForImageClassification` (ViTMSN model)
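
The `model_type` dispatch above can be sketched as a plain registry lookup. The following is a hypothetical illustration with stand-in classes, not the actual transformers implementation:

```python
# Hypothetical sketch of the dispatch above: an auto class looks up the
# config's `model_type` in a registry and returns the matching concrete class.
class ViTForImageClassificationStub:
    """Hypothetical stand-in for ViTForImageClassification."""

class BeitForImageClassificationStub:
    """Hypothetical stand-in for BeitForImageClassification."""

MODEL_FOR_IMAGE_CLASSIFICATION = {
    "vit": ViTForImageClassificationStub,
    "beit": BeitForImageClassificationStub,
}

def resolve_model_class(model_type: str):
    # Mirrors the lookup step: an unknown model_type raises instead of guessing.
    try:
        return MODEL_FOR_IMAGE_CLASSIFICATION[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model_type: {model_type!r}") from None
```

Registering a custom model with `AutoModel.register` (see the section on extending the auto classes) amounts to adding a new entry to a mapping of this kind.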

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
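
The kwargs routing described above (config attributes vs. model `__init__` arguments) can be illustrated with a small hypothetical helper; the function and names here are illustrative, not the library's internals:

```python
# Hypothetical sketch of how `from_pretrained` routes **kwargs when no
# explicit `config` is passed: keys matching configuration attributes
# update the config, the rest are forwarded to the model's __init__.
def split_kwargs(config_attributes, **kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

# Example: `output_attentions` is a config attribute, `custom_head` is not.
updates, extra = split_kwargs(
    {"output_attentions", "num_labels"},
    output_attentions=True,
    custom_head=3,
)
```

Here `updates` would be applied to the loaded configuration, while `extra` would reach the model constructor.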

### TFAutoModelForImageClassification[[transformers.TFAutoModelForImageClassification]]

#### transformers.TFAutoModelForImageClassification[[transformers.TFAutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L585)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ConvNextConfig` configuration class: `TFConvNextForImageClassification` (ConvNeXT model)
  - `ConvNextV2Config` configuration class: `TFConvNextV2ForImageClassification` (ConvNeXTV2 model)
  - `CvtConfig` configuration class: `TFCvtForImageClassification` (CvT model)
  - `Data2VecVisionConfig` configuration class: `TFData2VecVisionForImageClassification` (Data2VecVision model)
  - `DeiTConfig` configuration class: `TFDeiTForImageClassification` or `TFDeiTForImageClassificationWithTeacher` (DeiT model)
  - `EfficientFormerConfig` configuration class: `TFEfficientFormerForImageClassification` or `TFEfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
  - `MobileViTConfig` configuration class: `TFMobileViTForImageClassification` (MobileViT model)
  - `RegNetConfig` configuration class: `TFRegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `TFResNetForImageClassification` (ResNet model)
  - `SegformerConfig` configuration class: `TFSegformerForImageClassification` (SegFormer model)
  - `SwiftFormerConfig` configuration class: `TFSwiftFormerForImageClassification` (SwiftFormer model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForImageClassification) (Swin Transformer model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [TFViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTForImageClassification) (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = TFAutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `ConvNextConfig` configuration class: `TFConvNextForImageClassification` (ConvNeXT model) - `ConvNextV2Config` configuration class: `TFConvNextV2ForImageClassification` (ConvNeXTV2 model) - `CvtConfig` configuration class: `TFCvtForImageClassification` (CvT model) - `Data2VecVisionConfig` configuration class: `TFData2VecVisionForImageClassification` (Data2VecVision model) - `DeiTConfig` configuration class: `TFDeiTForImageClassification` or `TFDeiTForImageClassificationWithTeacher` (DeiT model) - `EfficientFormerConfig` configuration class: `TFEfficientFormerForImageClassification` or `TFEfficientFormerForImageClassificationWithTeacher` (EfficientFormer model) - `MobileViTConfig` configuration class: `TFMobileViTForImageClassification` (MobileViT model) - `RegNetConfig` configuration class: `TFRegNetForImageClassification` (RegNet model) - `ResNetConfig` configuration class: `TFResNetForImageClassification` (ResNet model) - `SegformerConfig` configuration class: `TFSegformerForImageClassification` (SegFormer model) - `SwiftFormerConfig` configuration class: `TFSwiftFormerForImageClassification` (SwiftFormer model) - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForImageClassification) (Swin Transformer model) - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [TFViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTForImageClassification) (ViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **convnext** -- `TFConvNextForImageClassification` (ConvNeXT model)
- **convnextv2** -- `TFConvNextV2ForImageClassification` (ConvNeXTV2 model)
- **cvt** -- `TFCvtForImageClassification` (CvT model)
- **data2vec-vision** -- `TFData2VecVisionForImageClassification` (Data2VecVision model)
- **deit** -- `TFDeiTForImageClassification` or `TFDeiTForImageClassificationWithTeacher` (DeiT model)
- **efficientformer** -- `TFEfficientFormerForImageClassification` or `TFEfficientFormerForImageClassificationWithTeacher` (EfficientFormer model)
- **mobilevit** -- `TFMobileViTForImageClassification` (MobileViT model)
- **regnet** -- `TFRegNetForImageClassification` (RegNet model)
- **resnet** -- `TFResNetForImageClassification` (ResNet model)
- **segformer** -- `TFSegformerForImageClassification` (SegFormer model)
- **swiftformer** -- `TFSwiftFormerForImageClassification` (SwiftFormer model)
- **swin** -- [TFSwinForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForImageClassification) (Swin Transformer model)
- **vit** -- [TFViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.TFViTForImageClassification) (ViT model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForImageClassification[[transformers.FlaxAutoModelForImageClassification]]

#### transformers.FlaxAutoModelForImageClassification[[transformers.FlaxAutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L361)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `BeitConfig` configuration class: `FlaxBeitForImageClassification` (BEiT model)
  - `Dinov2Config` configuration class: `FlaxDinov2ForImageClassification` (DINOv2 model)
  - `RegNetConfig` configuration class: `FlaxRegNetForImageClassification` (RegNet model)
  - `ResNetConfig` configuration class: `FlaxResNetForImageClassification` (ResNet model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [FlaxViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTForImageClassification) (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = FlaxAutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `BeitConfig` configuration class: `FlaxBeitForImageClassification` (BEiT model) - `Dinov2Config` configuration class: `FlaxDinov2ForImageClassification` (DINOv2 model) - `RegNetConfig` configuration class: `FlaxRegNetForImageClassification` (RegNet model) - `ResNetConfig` configuration class: `FlaxResNetForImageClassification` (ResNet model) - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [FlaxViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTForImageClassification) (ViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- `FlaxBeitForImageClassification` (BEiT model)
- **dinov2** -- `FlaxDinov2ForImageClassification` (DINOv2 model)
- **regnet** -- `FlaxRegNetForImageClassification` (RegNet model)
- **resnet** -- `FlaxResNetForImageClassification` (ResNet model)
- **vit** -- [FlaxViTForImageClassification](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.FlaxViTForImageClassification) (ViT model)
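The fallback pattern matching above amounts to a lookup from the `model_type` key to a concrete class. Here is a purely illustrative sketch of that dispatch (the real library resolves classes through lazy auto mappings, not a plain dict):

```python
# Illustrative only: map a config's model_type to the concrete
# Flax image-classification class name, mirroring the list above.
FLAX_IMAGE_CLASSIFICATION_MAPPING = {
    "beit": "FlaxBeitForImageClassification",
    "dinov2": "FlaxDinov2ForImageClassification",
    "regnet": "FlaxRegNetForImageClassification",
    "resnet": "FlaxResNetForImageClassification",
    "vit": "FlaxViTForImageClassification",
}

def resolve_model_class(model_type: str) -> str:
    """Return the class name registered for `model_type`, or raise."""
    try:
        return FLAX_IMAGE_CLASSIFICATION_MAPPING[model_type]
    except KeyError:
        raise ValueError(
            f"Unrecognized model type {model_type!r} for this auto class."
        )
```

An unrecognized `model_type` raises, which is also how the real auto classes surface configs with no registered mapping.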

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
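The two-way handling of `**kwargs` described above (keys that match configuration attributes become config overrides, the rest flow through to the model's `__init__`) can be sketched as follows. This is a simplified illustration, not the library's actual code:

```python
# Simplified illustration of how from_pretrained splits **kwargs when no
# explicit `config` is passed: keys matching config attributes override
# the config; everything else is forwarded to the model's __init__.
def split_kwargs(config_attrs: set, kwargs: dict) -> tuple[dict, dict]:
    config_overrides = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_overrides, model_kwargs


overrides, rest = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "my_custom_arg": 1},  # hypothetical kwargs
)
```

Here `overrides` holds `{"output_attentions": True}` for the config and `rest` holds `{"my_custom_arg": 1}` for the model constructor.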

### AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

#### transformers.AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2151)

This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForVideoClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [TimesformerConfig](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimeSformer model)
  - `VJEPA2Config` configuration class: `VJEPA2ForVideoClassification` (VJEPA2Model model)
  - `VideoMAEConfig` configuration class: `VideoMAEForVideoClassification` (VideoMAE model)
  - [VivitConfig](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitForVideoClassification) (ViViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a video classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("MCG-NJU/videomae-base")
>>> model = AutoModelForVideoClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [TimesformerConfig](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimeSformer model) - `VJEPA2Config` configuration class: `VJEPA2ForVideoClassification` (VJEPA2Model model) - `VideoMAEConfig` configuration class: `VideoMAEForVideoClassification` (VideoMAE model) - [VivitConfig](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitForVideoClassification) (ViViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVideoClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesformer** -- [TimesformerForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimeSformer model)
- **videomae** -- `VideoMAEForVideoClassification` (VideoMAE model)
- **vivit** -- [VivitForVideoClassification](/docs/transformers/v4.57.1/ko/model_doc/vivit#transformers.VivitForVideoClassification) (ViViT model)
- **vjepa2** -- `VJEPA2ForVideoClassification` (VJEPA2Model model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

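The eval/train toggle above is plain PyTorch `nn.Module` behavior; here is a minimal sketch with a bare module, so no checkpoint download is needed:

```python
import torch.nn as nn

# Dropout behaves differently in the two modes: active in train mode,
# a no-op after .eval() -- which is what from_pretrained() sets for you.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))

model.eval()            # inference mode, as after from_pretrained()
assert not model.training

model.train()           # switch back before fine-tuning
assert model.training
```

The `training` flag propagates to every submodule, so one call on the top-level model is enough.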
Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVideoClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForKeypointDetection[[transformers.AutoModelForKeypointDetection]]

#### transformers.AutoModelForKeypointDetection[[transformers.AutoModelForKeypointDetection]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1924)

This is a generic model class that will be instantiated as one of the model classes of the library (with a keypoint detection head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

### AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

#### transformers.AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2234)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMaskedImageModeling.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DeiTConfig` configuration class: `DeiTForMaskedImageModeling` (DeiT model)
  - `FocalNetConfig` configuration class: `FocalNetForMaskedImageModeling` (FocalNet model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForMaskedImageModeling) (Swin Transformer model)
  - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swin Transformer V2 model)
  - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViT model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("microsoft/swin-base-patch4-window7-224")
>>> model = AutoModelForMaskedImageModeling.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DeiTConfig` configuration class: `DeiTForMaskedImageModeling` (DeiT model) - `FocalNetConfig` configuration class: `FocalNetForMaskedImageModeling` (FocalNet model) - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForMaskedImageModeling) (Swin Transformer model) - [Swinv2Config](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swin Transformer V2 model) - [ViTConfig](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMaskedImageModeling.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **deit** -- `DeiTForMaskedImageModeling` (DeiT model)
- **focalnet** -- `FocalNetForMaskedImageModeling` (FocalNet model)
- **swin** -- [SwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinForMaskedImageModeling) (Swin Transformer model)
- **swinv2** -- [Swinv2ForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swin Transformer V2 model)
- **vit** -- [ViTForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViT model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-patch4-window7-224")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-patch4-window7-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForMaskedImageModeling[[transformers.TFAutoModelForMaskedImageModeling]]

#### transformers.TFAutoModelForMaskedImageModeling[[transformers.TFAutoModelForMaskedImageModeling]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L576)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForMaskedImageModeling.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DeiTConfig` configuration class: `TFDeiTForMaskedImageModeling` (DeiT model)
  - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForMaskedImageModeling) (Swin Transformer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedImageModeling.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DeiTConfig` configuration class: `TFDeiTForMaskedImageModeling` (DeiT model) - [SwinConfig](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.SwinConfig) configuration class: [TFSwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForMaskedImageModeling) (Swin Transformer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.TFAutoModelForMaskedImageModeling.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **deit** -- `TFDeiTForMaskedImageModeling` (DeiT model)
- **swin** -- [TFSwinForMaskedImageModeling](/docs/transformers/v4.57.1/ko/model_doc/swin#transformers.TFSwinForMaskedImageModeling) (Swin Transformer model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

#### transformers.AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2128)

This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ConditionalDetrConfig` configuration class: `ConditionalDetrForObjectDetection` (Conditional DETR model)
  - `DFineConfig` configuration class: `DFineForObjectDetection` (D-FINE model)
  - `DabDetrConfig` configuration class: `DabDetrForObjectDetection` (DAB-DETR model)
  - `DeformableDetrConfig` configuration class: `DeformableDetrForObjectDetection` (Deformable DETR model)
  - `DetaConfig` configuration class: `DetaForObjectDetection` (DETA model)
  - `DetrConfig` configuration class: `DetrForObjectDetection` (DETR model)
  - `RTDetrConfig` configuration class: `RTDetrForObjectDetection` (RT-DETR model)
  - `RTDetrV2Config` configuration class: `RTDetrV2ForObjectDetection` (RT-DETRv2 model)
  - `TableTransformerConfig` configuration class: `TableTransformerForObjectDetection` (Table Transformer model)
  - `YolosConfig` configuration class: `YolosForObjectDetection` (YOLOS model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForObjectDetection.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `ConditionalDetrConfig` configuration class: `ConditionalDetrForObjectDetection` (Conditional DETR model) - `DFineConfig` configuration class: `DFineForObjectDetection` (D-FINE model) - `DabDetrConfig` configuration class: `DabDetrForObjectDetection` (DAB-DETR model) - `DeformableDetrConfig` configuration class: `DeformableDetrForObjectDetection` (Deformable DETR model) - `DetaConfig` configuration class: `DetaForObjectDetection` (DETA model) - `DetrConfig` configuration class: `DetrForObjectDetection` (DETR model) - `RTDetrConfig` configuration class: `RTDetrForObjectDetection` (RT-DETR model) - `RTDetrV2Config` configuration class: `RTDetrV2ForObjectDetection` (RT-DETRv2 model) - `TableTransformerConfig` configuration class: `TableTransformerForObjectDetection` (Table Transformer model) - `YolosConfig` configuration class: `YolosForObjectDetection` (YOLOS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.AutoModelForObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **conditional_detr** -- `ConditionalDetrForObjectDetection` (Conditional DETR model)
- **d_fine** -- `DFineForObjectDetection` (D-FINE model)
- **dab-detr** -- `DabDetrForObjectDetection` (DAB-DETR model)
- **deformable_detr** -- `DeformableDetrForObjectDetection` (Deformable DETR model)
- **deta** -- `DetaForObjectDetection` (DETA model)
- **detr** -- `DetrForObjectDetection` (DETR model)
- **rt_detr** -- `RTDetrForObjectDetection` (RT-DETR model)
- **rt_detr_v2** -- `RTDetrV2ForObjectDetection` (RT-DETRv2 model)
- **table-transformer** -- `TableTransformerForObjectDetection` (Table Transformer model)
- **yolos** -- `YolosForObjectDetection` (YOLOS model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
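Because the same config-to-class mapping also drives `from_config()`, the dispatch can be verified offline with a small configuration and no checkpoint download. A minimal sketch (the tiny YOLOS hyperparameters below are illustrative only):

```python
from transformers import AutoModelForObjectDetection, YolosConfig

# A deliberately tiny, randomly initialized YOLOS configuration;
# nothing is downloaded from the Hub.
config = YolosConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_detection_tokens=5,
    num_labels=3,
)

# YolosConfig resolves to YolosForObjectDetection via the mapping above.
model = AutoModelForObjectDetection.from_config(config)
print(type(model).__name__)  # -> YolosForObjectDetection
```

The resulting model has random weights, so this is only a way to sanity-check which concrete class the auto class will select for a given configuration.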

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

#### transformers.AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2085)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DetrConfig` configuration class: `DetrForSegmentation` (DETR model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DetrConfig` configuration class: `DetrForSegmentation` (DETR model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.AutoModelForImageSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- `DetrForSegmentation` (DETR model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
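The two-way `kwargs` routing described above can be sketched as follows. `ToyConfig` and `split_kwargs` are hypothetical names used purely for illustration, not part of the transformers API:

```python
# Illustrative sketch (not the library's actual implementation) of how
# from_pretrained splits **kwargs when no explicit `config` is given:
# keys matching configuration attributes override the config, and the
# remainder is forwarded to the model's __init__.

class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 32

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override config attribute
        else:
            model_kwargs[key] = value    # forwarded to model __init__
    return config, model_kwargs

config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, custom_flag=1)
print(config.output_attentions)  # True
print(model_kwargs)              # {'custom_flag': 1}
```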

### AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

#### transformers.AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L1936)

### AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

#### transformers.AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2092)

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSemanticSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `BeitConfig` configuration class: `BeitForSemanticSegmentation` (BEiT model)
  - `DPTConfig` configuration class: `DPTForSemanticSegmentation` (DPT model)
  - `Data2VecVisionConfig` configuration class: `Data2VecVisionForSemanticSegmentation` (Data2VecVision model)
  - `MobileNetV2Config` configuration class: `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model)
  - `MobileViTConfig` configuration class: `MobileViTForSemanticSegmentation` (MobileViT model)
  - `MobileViTV2Config` configuration class: `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model)
  - `SegformerConfig` configuration class: `SegformerForSemanticSegmentation` (SegFormer model)
  - `UperNetConfig` configuration class: `UperNetForSemanticSegmentation` (UPerNet model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = AutoModelForSemanticSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `BeitConfig` configuration class: `BeitForSemanticSegmentation` (BEiT model) - `DPTConfig` configuration class: `DPTForSemanticSegmentation` (DPT model) - `Data2VecVisionConfig` configuration class: `Data2VecVisionForSemanticSegmentation` (Data2VecVision model) - `MobileNetV2Config` configuration class: `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model) - `MobileViTConfig` configuration class: `MobileViTForSemanticSegmentation` (MobileViT model) - `MobileViTV2Config` configuration class: `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model) - `SegformerConfig` configuration class: `SegformerForSemanticSegmentation` (SegFormer model) - `UperNetConfig` configuration class: `UperNetForSemanticSegmentation` (UPerNet model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
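The documented default selection for `attn_implementation` can be sketched as a small decision function; `pick_attn_implementation` is an illustrative helper under stated assumptions, not a transformers internal:

```python
# Hedged sketch of the documented default: an explicit request wins;
# otherwise SDPA is used when available on torch>=2.1.1, falling back
# to the manual "eager" implementation.

def pick_attn_implementation(requested=None, torch_version=(2, 1, 1), sdpa_available=True):
    if requested is not None:
        return requested  # the user's explicit choice is honored
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(pick_attn_implementation())                               # sdpa
print(pick_attn_implementation(requested="flash_attention_2"))  # flash_attention_2
print(pick_attn_implementation(torch_version=(2, 0, 0)))        # eager
```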
#### from_pretrained[[transformers.AutoModelForSemanticSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- `BeitForSemanticSegmentation` (BEiT model)
- **data2vec-vision** -- `Data2VecVisionForSemanticSegmentation` (Data2VecVision model)
- **dpt** -- `DPTForSemanticSegmentation` (DPT model)
- **mobilenet_v2** -- `MobileNetV2ForSemanticSegmentation` (MobileNetV2 model)
- **mobilevit** -- `MobileViTForSemanticSegmentation` (MobileViT model)
- **mobilevitv2** -- `MobileViTV2ForSemanticSegmentation` (MobileViTV2 model)
- **segformer** -- `SegformerForSemanticSegmentation` (SegFormer model)
- **upernet** -- `UperNetForSemanticSegmentation` (UPerNet model)
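The `model_type`-to-class dispatch in the list above can be illustrated with a toy registry; the dict and `resolve_model_class` below are a simplified stand-in for the real mapping in transformers' auto factory, not its actual code:

```python
# Toy stand-in for the model_type -> class dispatch performed by the
# auto classes; raises on unknown types, mirroring the error users see
# when a checkpoint's model_type is not in the mapping.

SEMANTIC_SEGMENTATION_MAPPING = {
    "beit": "BeitForSemanticSegmentation",
    "data2vec-vision": "Data2VecVisionForSemanticSegmentation",
    "segformer": "SegformerForSemanticSegmentation",
    "upernet": "UperNetForSemanticSegmentation",
}

def resolve_model_class(model_type):
    try:
        return SEMANTIC_SEGMENTATION_MAPPING[model_type]
    except KeyError:
        raise ValueError(
            f"Unrecognized model_type {model_type!r} for AutoModelForSemanticSegmentation"
        )

print(resolve_model_class("segformer"))  # SegformerForSemanticSegmentation
```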

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSemanticSegmentation[[transformers.TFAutoModelForSemanticSegmentation]]

#### transformers.TFAutoModelForSemanticSegmentation[[transformers.TFAutoModelForSemanticSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L603)

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSemanticSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Data2VecVisionConfig` configuration class: `TFData2VecVisionForSemanticSegmentation` (Data2VecVision model)
  - `MobileViTConfig` configuration class: `TFMobileViTForSemanticSegmentation` (MobileViT model)
  - `SegformerConfig` configuration class: `TFSegformerForSemanticSegmentation` (SegFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = TFAutoModelForSemanticSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Data2VecVisionConfig` configuration class: `TFData2VecVisionForSemanticSegmentation` (Data2VecVision model) - `MobileViTConfig` configuration class: `TFMobileViTForSemanticSegmentation` (MobileViT model) - `SegformerConfig` configuration class: `TFSegformerForSemanticSegmentation` (SegFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSemanticSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-vision** -- `TFData2VecVisionForSemanticSegmentation` (Data2VecVision model)
- **mobilevit** -- `TFMobileViTForSemanticSegmentation` (MobileViT model)
- **segformer** -- `TFSegformerForSemanticSegmentation` (SegFormer model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

#### transformers.AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2119)

This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForInstanceSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/maskformer-swin-base-ade")
>>> model = AutoModelForInstanceSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForInstanceSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **maskformer** -- `MaskFormerForInstanceSegmentation` (MaskFormer model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-coco")

>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-coco", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
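For quick experimentation, the model can also be instantiated without downloading any weights. A minimal sketch (assuming `transformers` with the PyTorch backend is installed) builds a `MaskFormerConfig` locally and dispatches through `from_config`, which creates randomly initialized weights and needs no Hub access:

```python
from transformers import AutoModelForInstanceSegmentation, MaskFormerConfig

# from_config builds the architecture with randomly initialized weights,
# so nothing is downloaded from the Hub.
config = MaskFormerConfig()
model = AutoModelForInstanceSegmentation.from_config(config)
print(type(model).__name__)  # MaskFormerForInstanceSegmentation
```

Because `MaskFormerConfig` is the only configuration class in this auto class's mapping, the dispatch always resolves to `MaskFormerForInstanceSegmentation` here.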

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

#### transformers.AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2110)

This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForUniversalSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DetrConfig` configuration class: `DetrForSegmentation` (DETR model)
  - `EomtConfig` configuration class: `EomtForUniversalSegmentation` (EoMT model)
  - `Mask2FormerConfig` configuration class: `Mask2FormerForUniversalSegmentation` (Mask2Former model)
  - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model)
  - `OneFormerConfig` configuration class: `OneFormerForUniversalSegmentation` (OneFormer model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/mask2former-swin-base-coco-panoptic")
>>> model = AutoModelForUniversalSegmentation.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DetrConfig` configuration class: `DetrForSegmentation` (DETR model) - `EomtConfig` configuration class: `EomtForUniversalSegmentation` (EoMT model) - `Mask2FormerConfig` configuration class: `Mask2FormerForUniversalSegmentation` (Mask2Former model) - `MaskFormerConfig` configuration class: `MaskFormerForInstanceSegmentation` (MaskFormer model) - `OneFormerConfig` configuration class: `OneFormerForUniversalSegmentation` (OneFormer model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForUniversalSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- `DetrForSegmentation` (DETR model)
- **eomt** -- `EomtForUniversalSegmentation` (EoMT model)
- **mask2former** -- `Mask2FormerForUniversalSegmentation` (Mask2Former model)
- **maskformer** -- `MaskFormerForInstanceSegmentation` (MaskFormer model)
- **oneformer** -- `OneFormerForUniversalSegmentation` (OneFormer model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic")

>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForUniversalSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
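To see the config-to-class dispatch from the mapping above without downloading any weights, a minimal sketch (assuming `transformers` with the PyTorch backend is installed) builds a `Mask2FormerConfig` locally; the configuration class alone determines which model class is instantiated:

```python
from transformers import AutoModelForUniversalSegmentation, Mask2FormerConfig

# A locally built config is enough: from_config creates randomly
# initialized weights and requires no Hub access.
config = Mask2FormerConfig()
model = AutoModelForUniversalSegmentation.from_config(config)
print(type(model).__name__)  # Mask2FormerForUniversalSegmentation
```

Swapping in `OneFormerConfig` or `DetrConfig` would resolve to `OneFormerForUniversalSegmentation` or `DetrForSegmentation`, respectively, per the mapping above.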

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

#### transformers.AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2076)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `AlignConfig` configuration class: `AlignModel` (ALIGN model)
  - [AltCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
  - [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForImageTextRetrieval](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (BLIP-2 model)
  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model)
  - [CLIPSegConfig](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
  - `ChineseCLIPConfig` configuration class: `ChineseCLIPModel` (Chinese-CLIP model)
  - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model)
  - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model)
  - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `AlignConfig` configuration class: `AlignModel` (ALIGN model) - [AltCLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model) - [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForImageTextRetrieval](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (BLIP-2 model) - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model) - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model) - [CLIPSegConfig](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model) - `ChineseCLIPConfig` configuration class: `ChineseCLIPModel` (Chinese-CLIP model) - `MetaClip2Config` configuration class: `MetaClip2Model` (MetaCLIP 2 model) - `Siglip2Config` configuration class: `Siglip2Model` (SigLIP2 model) - [SiglipConfig](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForZeroShotImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **align** -- `AlignModel` (ALIGN model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/altclip#transformers.AltCLIPModel) (AltCLIP model)
- **blip** -- [BlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipModel) (BLIP model)
- **blip-2** -- [Blip2ForImageTextRetrieval](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (BLIP-2 model)
- **chinese_clip** -- `ChineseCLIPModel` (Chinese-CLIP model)
- **clip** -- [CLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPModel) (CLIP model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v4.57.1/ko/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSeg model)
- **metaclip_2** -- `MetaClip2Model` (MetaCLIP 2 model)
- **siglip** -- [SiglipModel](/docs/transformers/v4.57.1/ko/model_doc/siglip#transformers.SiglipModel) (SigLIP model)
- **siglip2** -- `Siglip2Model` (SigLIP2 model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
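The same dispatch applies here: a minimal sketch (assuming `transformers` with the PyTorch backend is installed) instantiates a randomly initialized CLIP model from a locally built `CLIPConfig`, with no checkpoint download:

```python
from transformers import AutoModelForZeroShotImageClassification, CLIPConfig

# A locally built CLIPConfig maps to CLIPModel; no Hub access is required
# because from_config does not load pretrained weights.
config = CLIPConfig()
model = AutoModelForZeroShotImageClassification.from_config(config)
print(type(model).__name__)  # CLIPModel
```

Any other configuration class from the mapping above (e.g., `SiglipConfig`) would resolve to its corresponding model class in the same way.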

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForZeroShotImageClassification[[transformers.TFAutoModelForZeroShotImageClassification]]

#### transformers.TFAutoModelForZeroShotImageClassification[[transformers.TFAutoModelForZeroShotImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L594)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForZeroShotImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model)
  - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForZeroShotImageClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model) - [CLIPConfig](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.CLIPConfig) configuration class: [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForZeroShotImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [TFBlipModel](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipModel) (BLIP model)
- **clip** -- [TFCLIPModel](/docs/transformers/v4.57.1/ko/model_doc/clip#transformers.TFCLIPModel) (CLIP model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

#### transformers.AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2135)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [GroundingDinoConfig](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoForObjectDetection](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoForObjectDetection) (Grounding DINO model)
  - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoForObjectDetection` (MM Grounding DINO model)
  - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model)
  - `OwlViTConfig` configuration class: `OwlViTForObjectDetection` (OWL-ViT model)
  - `Owlv2Config` configuration class: `Owlv2ForObjectDetection` (OWLv2 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [GroundingDinoConfig](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoForObjectDetection](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoForObjectDetection) (Grounding DINO model) - `MMGroundingDinoConfig` configuration class: `MMGroundingDinoForObjectDetection` (MM Grounding DINO model) - `OmDetTurboConfig` configuration class: `OmDetTurboForObjectDetection` (OmDet-Turbo model) - `OwlViTConfig` configuration class: `OwlViTForObjectDetection` (OWL-ViT model) - `Owlv2Config` configuration class: `Owlv2ForObjectDetection` (OWLv2 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForZeroShotObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **grounding-dino** -- [GroundingDinoForObjectDetection](/docs/transformers/v4.57.1/ko/model_doc/grounding-dino#transformers.GroundingDinoForObjectDetection) (Grounding DINO model)
- **mm-grounding-dino** -- `MMGroundingDinoForObjectDetection` (MM Grounding DINO model)
- **omdet-turbo** -- `OmDetTurboForObjectDetection` (OmDet-Turbo model)
- **owlv2** -- `Owlv2ForObjectDetection` (OWLv2 model)
- **owlvit** -- `OwlViTForObjectDetection` (OWL-ViT model)
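The dispatch rule described above (prefer the config's `model_type`, then fall back to pattern matching on the checkpoint name) can be sketched as follows. This is an illustrative sketch, not the real auto-factory code; the registry values are placeholder strings rather than imported classes.

```python
# Hypothetical sketch of the model_type / pattern-matching dispatch above.

REGISTRY = {
    "grounding-dino": "GroundingDinoForObjectDetection",
    "omdet-turbo": "OmDetTurboForObjectDetection",
    "owlv2": "Owlv2ForObjectDetection",
    "owlvit": "OwlViTForObjectDetection",
}

def resolve_model_class(name_or_path, model_type=None):
    # Preferred path: the config's `model_type` keys the registry directly.
    if model_type is not None:
        return REGISTRY[model_type]
    # Fallback: substring-match the checkpoint name, longest keys first so a
    # more specific pattern wins over a shorter one it contains.
    for key in sorted(REGISTRY, key=len, reverse=True):
        if key in name_or_path:
            return REGISTRY[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")

print(resolve_model_class("google/owlv2-base-patch16"))  # Owlv2ForObjectDetection
```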

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## 오디오[[audio]]

다음 자동 클래스들은 아래의 오디오 작업에 사용할 수 있습니다.

### AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

#### transformers.AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2183)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `ASTConfig` configuration class: `ASTForAudioClassification` (Audio Spectrogram Transformer model)
  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForSequenceClassification` (Data2VecAudio model)
  - `HubertConfig` configuration class: `HubertForSequenceClassification` (Hubert model)
  - `SEWConfig` configuration class: `SEWForSequenceClassification` (SEW model)
  - `SEWDConfig` configuration class: `SEWDForSequenceClassification` (SEW-D model)
  - `UniSpeechConfig` configuration class: `UniSpeechForSequenceClassification` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForSequenceClassification` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForSequenceClassification` (WavLM model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForAudioClassification](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForAudioClassification) (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `ASTConfig` configuration class: `ASTForAudioClassification` (Audio Spectrogram Transformer model) - `Data2VecAudioConfig` configuration class: `Data2VecAudioForSequenceClassification` (Data2VecAudio model) - `HubertConfig` configuration class: `HubertForSequenceClassification` (Hubert model) - `SEWConfig` configuration class: `SEWForSequenceClassification` (SEW model) - `SEWDConfig` configuration class: `SEWDForSequenceClassification` (SEW-D model) - `UniSpeechConfig` configuration class: `UniSpeechForSequenceClassification` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForSequenceClassification` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForSequenceClassification` (WavLM model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForAudioClassification](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForAudioClassification) (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- `ASTForAudioClassification` (Audio Spectrogram Transformer model)
- **data2vec-audio** -- `Data2VecAudioForSequenceClassification` (Data2VecAudio model)
- **hubert** -- `HubertForSequenceClassification` (Hubert model)
- **sew** -- `SEWForSequenceClassification` (SEW model)
- **sew-d** -- `SEWDForSequenceClassification` (SEW-D model)
- **unispeech** -- `UniSpeechForSequenceClassification` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForSequenceClassification` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForSequenceClassification` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForSequenceClassification` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForSequenceClassification` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForSequenceClassification` (WavLM model)
- **whisper** -- [WhisperForAudioClassification](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForAudioClassification) (Whisper model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with `model.train()`.
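The practical effect of the eval/train distinction above can be shown with a tiny pure-Python dropout sketch (no torch; `TinyDropout` is a hypothetical illustration, not a transformers class): in evaluation mode dropout is a no-op, while in training mode it randomly zeroes activations and rescales the survivors.

```python
# Minimal sketch of dropout's train/eval behavior, mirroring eval-by-default.
import random

class TinyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = False  # eval mode by default, like from_pretrained()

    def train(self):
        self.training = True

    def eval(self):
        self.training = False

    def __call__(self, xs):
        if not self.training:
            return list(xs)  # eval mode: inputs pass through unchanged
        # train mode: zero each value with probability p, rescale the rest
        return [0.0 if random.random() < self.p else x / (1 - self.p) for x in xs]

drop = TinyDropout(p=0.5)
print(drop([1.0, 2.0, 3.0]))  # eval mode: [1.0, 2.0, 3.0]
```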

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
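The `kwargs` routing described above can be sketched in plain Python (a simplified, hypothetical illustration of the documented behavior, not the library's actual implementation):

```python
# Hypothetical sketch of how `kwargs` are routed when no explicit `config` is
# passed: keys matching configuration attributes update the config, while the
# remaining keys are forwarded to the model's __init__.
def split_kwargs(config_attrs: dict, kwargs: dict):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    updated_config = {**config_attrs, **config_updates}
    return updated_config, model_kwargs


config, extra = split_kwargs(
    {"output_attentions": False, "hidden_size": 768},
    {"output_attentions": True, "custom_flag": 1},
)
print(config["output_attentions"])  # True
print(extra)  # {'custom_flag': 1}
```

This mirrors why `output_attentions=True` in the examples above ends up on `model.config` rather than being rejected by the model constructor.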

### TFAutoModelForAudioClassification[[transformers.TFAutoModelForAudioClassification]]

#### transformers.TFAutoModelForAudioClassification[[transformers.TFAutoModelForAudioClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L545)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForAudioClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Wav2Vec2Config` configuration class: `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForAudioClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Wav2Vec2Config` configuration class: `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForAudioClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **wav2vec2** -- `TFWav2Vec2ForSequenceClassification` (Wav2Vec2 model)
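The selection logic above — dispatch on the config's `model_type`, falling back to pattern matching on the checkpoint name — can be sketched roughly as follows (a simplified, hypothetical registry; the real dispatch lives in `auto_factory.py`):

```python
# Simplified sketch of Auto-class dispatch: look up the config's model_type in
# a registry, falling back to substring matching on the checkpoint name/path.
MODEL_MAPPING = {"wav2vec2": "TFWav2Vec2ForSequenceClassification"}


def resolve_model_class(name_or_path, model_type=None):
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # Fallback: pattern matching on the name or path.
    for key, cls in MODEL_MAPPING.items():
        if key in name_or_path:
            return cls
    raise ValueError(f"Could not infer model type from {name_or_path!r}")


print(resolve_model_class("facebook/wav2vec2-base"))
# TFWav2Vec2ForSequenceClassification
```

This is also why `AutoModel.register(NewModelConfig, NewModel)` from the extension section works: registering simply adds an entry to such a mapping.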

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForAudioClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForAudioClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

#### transformers.AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2206)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioFrameClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForAudioFrameClassification` (Data2VecAudio model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForAudioFrameClassification` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForAudioFrameClassification.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForAudioFrameClassification` (Data2VecAudio model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForAudioFrameClassification` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioFrameClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- `Data2VecAudioForAudioFrameClassification` (Data2VecAudio model)
- **unispeech-sat** -- `UniSpeechSatForAudioFrameClassification` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForAudioFrameClassification` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForAudioFrameClassification` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForAudioFrameClassification` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForAudioFrameClassification` (WavLM model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
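The eval/train semantics described above can be illustrated with a minimal stand-in class (hypothetical; real models inherit this behavior from `torch.nn.Module.eval()`/`train()`):

```python
import random


# Minimal stand-in for a module with a training-mode flag, illustrating why
# dropout is a no-op after from_pretrained() puts the model in eval mode.
class TinyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # modules start in training mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, x):
        if not self.training:
            return x  # dropout is the identity in evaluation mode
        return [v if random.random() > self.p else 0.0 for v in x]


drop = TinyDropout().eval()  # from_pretrained() returns models in eval mode
print(drop([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0] -- unchanged
```

Calling `drop.train()` before fine-tuning re-enables the stochastic behavior, which is exactly the `model.train()` step the note above asks for.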

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForCTC[[transformers.AutoModelForCTC]]

#### transformers.AutoModelForCTC[[transformers.AutoModelForCTC]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2190)

This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForCTC.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForCTC` (Data2VecAudio model)
  - `HubertConfig` configuration class: `HubertForCTC` (Hubert model)
  - `MCTCTConfig` configuration class: `MCTCTForCTC` (M-CTC-T model)
  - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model)
  - `SEWConfig` configuration class: `SEWForCTC` (SEW model)
  - `SEWDConfig` configuration class: `SEWDForCTC` (SEW-D model)
  - `UniSpeechConfig` configuration class: `UniSpeechForCTC` (UniSpeech model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForCTC` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForCTC` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForCTC` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCTC.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForCTC` (Data2VecAudio model) - `HubertConfig` configuration class: `HubertForCTC` (Hubert model) - `MCTCTConfig` configuration class: `MCTCTForCTC` (M-CTC-T model) - `ParakeetCTCConfig` configuration class: `ParakeetForCTC` (Parakeet model) - `SEWConfig` configuration class: `SEWForCTC` (SEW model) - `SEWDConfig` configuration class: `SEWDForCTC` (SEW-D model) - `UniSpeechConfig` configuration class: `UniSpeechForCTC` (UniSpeech model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForCTC` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForCTC` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForCTC` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForCTC.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- `Data2VecAudioForCTC` (Data2VecAudio model)
- **hubert** -- `HubertForCTC` (Hubert model)
- **mctct** -- `MCTCTForCTC` (M-CTC-T model)
- **parakeet_ctc** -- `ParakeetForCTC` (Parakeet model)
- **sew** -- `SEWForCTC` (SEW model)
- **sew-d** -- `SEWDForCTC` (SEW-D model)
- **unispeech** -- `UniSpeechForCTC` (UniSpeech model)
- **unispeech-sat** -- `UniSpeechSatForCTC` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForCTC` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForCTC` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForCTC` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForCTC` (WavLM model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
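The `kwargs` routing described above (when no `config` is passed) can be illustrated with a small standalone sketch. The helper below is hypothetical, not the library's actual code: keys that match configuration attributes update the config, and only the leftovers are forwarded to the model's `__init__`:

```python
# Hypothetical sketch of the kwargs routing described above: keys matching
# known configuration attributes become config overrides, everything else is
# forwarded to the model constructor. Not the actual transformers code.
def split_kwargs(config_attributes: set, kwargs: dict):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

config_updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "custom_flag": 1},
)
print(config_updates)  # {'output_attentions': True}
print(model_kwargs)    # {'custom_flag': 1}
```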

### AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

#### transformers.AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2197)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `DiaConfig` configuration class: `DiaForConditionalGeneration` (Dia model)
  - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
  - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model)
  - `MoonshineConfig` configuration class: `MoonshineForConditionalGeneration` (Moonshine model)
  - `Pop2PianoConfig` configuration class: `Pop2PianoForConditionalGeneration` (Pop2Piano model)
  - `SeamlessM4TConfig` configuration class: `SeamlessM4TForSpeechToText` (SeamlessM4T model)
  - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model)
  - `Speech2TextConfig` configuration class: `Speech2TextForConditionalGeneration` (Speech2Text model)
  - `SpeechEncoderDecoderConfig` configuration class: `SpeechEncoderDecoderModel` (Speech Encoder decoder model)
  - `SpeechT5Config` configuration class: `SpeechT5ForSpeechToText` (SpeechT5 model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForConditionalGeneration) (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `DiaConfig` configuration class: `DiaForConditionalGeneration` (Dia model) - `GraniteSpeechConfig` configuration class: `GraniteSpeechForConditionalGeneration` (GraniteSpeech model) - `KyutaiSpeechToTextConfig` configuration class: `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model) - `MoonshineConfig` configuration class: `MoonshineForConditionalGeneration` (Moonshine model) - `Pop2PianoConfig` configuration class: `Pop2PianoForConditionalGeneration` (Pop2Piano model) - `SeamlessM4TConfig` configuration class: `SeamlessM4TForSpeechToText` (SeamlessM4T model) - `SeamlessM4Tv2Config` configuration class: `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model) - `Speech2TextConfig` configuration class: `Speech2TextForConditionalGeneration` (Speech2Text model) - `SpeechEncoderDecoderConfig` configuration class: `SpeechEncoderDecoderModel` (Speech Encoder decoder model) - `SpeechT5Config` configuration class: `SpeechT5ForSpeechToText` (SpeechT5 model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForConditionalGeneration) (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **dia** -- `DiaForConditionalGeneration` (Dia model)
- **granite_speech** -- `GraniteSpeechForConditionalGeneration` (GraniteSpeech model)
- **kyutai_speech_to_text** -- `KyutaiSpeechToTextForConditionalGeneration` (KyutaiSpeechToText model)
- **moonshine** -- `MoonshineForConditionalGeneration` (Moonshine model)
- **pop2piano** -- `Pop2PianoForConditionalGeneration` (Pop2Piano model)
- **seamless_m4t** -- `SeamlessM4TForSpeechToText` (SeamlessM4T model)
- **seamless_m4t_v2** -- `SeamlessM4Tv2ForSpeechToText` (SeamlessM4Tv2 model)
- **speech-encoder-decoder** -- `SpeechEncoderDecoderModel` (Speech Encoder decoder model)
- **speech_to_text** -- `Speech2TextForConditionalGeneration` (Speech2Text model)
- **speecht5** -- `SpeechT5ForSpeechToText` (SpeechT5 model)
- **whisper** -- [WhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperForConditionalGeneration) (Whisper model)
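When the config carries no `model_type`, the fallback described above is pattern matching on `pretrained_model_name_or_path`. The sketch below illustrates that idea in isolation; it is a simplified assumption about the behavior, not the library's actual matcher, and the key list covers only a few entries from the table above:

```python
# Simplified sketch of the pattern-matching fallback: with no `model_type`
# available, the auto class looks for a known mapping key inside the
# checkpoint name or path. Not the actual transformers implementation.
SPEECH_SEQ2SEQ_KEYS = ["whisper", "speecht5", "speech_to_text", "moonshine"]

def guess_model_key(pretrained_model_name_or_path: str):
    """Return the first known key found in the name/path, else None."""
    lowered = pretrained_model_name_or_path.lower()
    for key in SPEECH_SEQ2SEQ_KEYS:
        if key in lowered:
            return key
    return None

print(guess_model_key("openai/whisper-tiny"))  # whisper
```

A name with no recognizable key (e.g. `"google-bert/bert-base-cased"`) yields `None` here, mirroring how the real auto class fails when neither the config nor the name identifies a supported architecture.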

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForSpeechSeq2Seq[[transformers.TFAutoModelForSpeechSeq2Seq]]

#### transformers.TFAutoModelForSpeechSeq2Seq[[transformers.TFAutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L700)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Speech2TextConfig` configuration class: `TFSpeech2TextForConditionalGeneration` (Speech2Text model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [TFWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperForConditionalGeneration) (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = TFAutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Speech2TextConfig` configuration class: `TFSpeech2TextForConditionalGeneration` (Speech2Text model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [TFWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperForConditionalGeneration) (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **speech_to_text** -- `TFSpeech2TextForConditionalGeneration` (Speech2Text model)
- **whisper** -- [TFWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.TFWhisperForConditionalGeneration) (Whisper model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### FlaxAutoModelForSpeechSeq2Seq[[transformers.FlaxAutoModelForSpeechSeq2Seq]]

#### transformers.FlaxAutoModelForSpeechSeq2Seq[[transformers.FlaxAutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L377)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `SpeechEncoderDecoderConfig` configuration class: `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model)
  - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_config(config)
```
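The selection performed by `from_config()` amounts to a lookup keyed by the configuration's class. The sketch below illustrates that idea only; the class names are hypothetical stand-ins, not the actual transformers internals:

```python
# Toy sketch of Auto-class dispatch by configuration class.
# All names here are stand-ins for illustration, not real transformers code.

class WhisperConfig:
    model_type = "whisper"

class FlaxWhisperForConditionalGeneration:
    def __init__(self, config):
        self.config = config

# from_config() consults a mapping from configuration class to model class.
_CONFIG_MAPPING = {
    WhisperConfig: FlaxWhisperForConditionalGeneration,
}

def from_config(config):
    model_cls = _CONFIG_MAPPING.get(type(config))
    if model_cls is None:
        raise ValueError(f"Unrecognized configuration class {type(config)!r}")
    return model_cls(config)

model = from_config(WhisperConfig())
print(type(model).__name__)  # FlaxWhisperForConditionalGeneration
```

An unregistered configuration class raises a `ValueError`, mirroring the error the real Auto classes produce for unknown configurations.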

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `SpeechEncoderDecoderConfig` configuration class: `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model) - [WhisperConfig](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.WhisperConfig) configuration class: [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **speech-encoder-decoder** -- `FlaxSpeechEncoderDecoderModel` (Speech Encoder decoder model)
- **whisper** -- [FlaxWhisperForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/whisper#transformers.FlaxWhisperForConditionalGeneration) (Whisper model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *PyTorch state_dict save file* (e.g, `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the PyTorch model to a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
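The two behaviors described for `kwargs` can be sketched in plain Python: when no explicit `config` is supplied, keys that match configuration attributes update the config, and the remainder is forwarded to the model. This is a simplified illustration, not the actual implementation:

```python
# Simplified illustration of how **kwargs are split when no explicit
# `config` is passed: config-attribute keys update the configuration,
# everything else is forwarded to the model's __init__. Not real internals.

class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides a config attribute
        else:
            model_kwargs[key] = value    # passed through to the model
    return config, model_kwargs

config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "custom_arg": 1}
)
print(config.output_attentions, model_kwargs)  # True {'custom_arg': 1}
```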

### AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

#### transformers.AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2215)

This is a generic model class that will be instantiated as one of the model classes of the library (with an x-vector head for audio retrieval) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioXVector.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForXVector` (Data2VecAudio model)
  - `UniSpeechSatConfig` configuration class: `UniSpeechSatForXVector` (UniSpeechSat model)
  - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model)
  - `Wav2Vec2Config` configuration class: `Wav2Vec2ForXVector` (Wav2Vec2 model)
  - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model)
  - `WavLMConfig` configuration class: `WavLMForXVector` (WavLM model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an x-vector head for audio retrieval) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("microsoft/wavlm-base-plus-sv")
>>> model = AutoModelForAudioXVector.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `Data2VecAudioConfig` configuration class: `Data2VecAudioForXVector` (Data2VecAudio model) - `UniSpeechSatConfig` configuration class: `UniSpeechSatForXVector` (UniSpeechSat model) - `Wav2Vec2BertConfig` configuration class: `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model) - `Wav2Vec2Config` configuration class: `Wav2Vec2ForXVector` (Wav2Vec2 model) - `Wav2Vec2ConformerConfig` configuration class: `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model) - `WavLMConfig` configuration class: `WavLMForXVector` (WavLM model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioXVector.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with an x-vector head for audio retrieval) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- `Data2VecAudioForXVector` (Data2VecAudio model)
- **unispeech-sat** -- `UniSpeechSatForXVector` (UniSpeechSat model)
- **wav2vec2** -- `Wav2Vec2ForXVector` (Wav2Vec2 model)
- **wav2vec2-bert** -- `Wav2Vec2BertForXVector` (Wav2Vec2-BERT model)
- **wav2vec2-conformer** -- `Wav2Vec2ConformerForXVector` (Wav2Vec2-Conformer model)
- **wavlm** -- `WavLMForXVector` (WavLM model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
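The `model_type` fallback listed above amounts to a string-keyed lookup from the config's `model_type` to a concrete class. A self-contained sketch follows; the mapping here is a toy subset for illustration, not the library's full table:

```python
# Toy subset of the model_type -> class-name table used for dispatch.
# Illustrative only; the real mapping is maintained inside transformers.

_MODEL_TYPE_TO_CLASS = {
    "data2vec-audio": "Data2VecAudioForXVector",
    "unispeech-sat": "UniSpeechSatForXVector",
    "wav2vec2": "Wav2Vec2ForXVector",
    "wavlm": "WavLMForXVector",
}

def resolve_xvector_class(model_type: str) -> str:
    """Return the class name registered for a model_type string."""
    if model_type not in _MODEL_TYPE_TO_CLASS:
        raise ValueError(f"Unrecognized model type: {model_type}")
    return _MODEL_TYPE_TO_CLASS[model_type]

print(resolve_xvector_class("wavlm"))  # WavLMForXVector
```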

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv")

>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

#### transformers.AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2219)

### AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

#### transformers.AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2223)

## 멀티모달[[multimodal]]

다음 자동 클래스들은 아래의 멀티모달 작업에 사용할 수 있습니다.

### AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

#### transformers.AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2013)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTableQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TapasConfig` configuration class: `TapasForQuestionAnswering` (TAPAS model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TapasConfig` configuration class: `TapasForQuestionAnswering` (TAPAS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTableQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **tapas** -- `TapasForQuestionAnswering` (TAPAS model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
...     "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or url to a *tensorflow index checkpoint file* (e.g, `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForTableQuestionAnswering[[transformers.TFAutoModelForTableQuestionAnswering]]

#### transformers.TFAutoModelForTableQuestionAnswering[[transformers.TFAutoModelForTableQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L664)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForTableQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TapasConfig` configuration class: `TFTapasForQuestionAnswering` (TAPAS model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.0

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `TapasConfig` configuration class: `TFTapasForQuestionAnswering` (TAPAS model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
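The default selection described for `attn_implementation` can be illustrated with a small sketch. `pick_attn_implementation` and the `sdpa_available` flag are hypothetical stand-ins for the library's internal capability check (which inspects the installed torch version):

```python
def pick_attn_implementation(requested=None, sdpa_available=True):
    """Return the attention backend, mimicking the documented default."""
    if requested is not None:
        # Explicit choice: "eager", "sdpa", or "flash_attention_2".
        return requested
    # Default: SDPA when torch >= 2.1.1 provides it, otherwise eager.
    return "sdpa" if sdpa_available else "eager"


pick_attn_implementation()                      # "sdpa"
pick_attn_implementation(sdpa_available=False)  # "eager"
pick_attn_implementation("flash_attention_2")   # "flash_attention_2"
```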
#### from_pretrained[[transformers.TFAutoModelForTableQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **tapas** -- `TFTapasForQuestionAnswering` (TAPAS model)
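The selection logic above (config `model_type` first, then pattern matching on the name or path) can be sketched as follows. `MODEL_MAPPING` and `select_model_class` are illustrative stand-ins, not the actual transformers internals:

```python
# One entry per supported model type; the real mapping lives inside the library.
MODEL_MAPPING = {"tapas": "TFTapasForQuestionAnswering"}


def select_model_class(pretrained_model_name_or_path, model_type=None):
    # Prefer the config's `model_type` when it is available...
    if model_type is not None:
        return MODEL_MAPPING[model_type]
    # ...otherwise fall back to pattern matching on the name or path.
    for pattern, model_class in MODEL_MAPPING.items():
        if pattern in pretrained_model_name_or_path:
            return model_class
    raise ValueError("Could not infer the model class from the name or path")


select_model_class("google/tapas-base-finetuned-wtq")  # "TFTapasForQuestionAnswering"
```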

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

#### transformers.AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2035)

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDocumentQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `LayoutLMConfig` configuration class: `LayoutLMForQuestionAnswering` (LayoutLM model) - `LayoutLMv2Config` configuration class: `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model) - `LayoutLMv3Config` configuration class: `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForDocumentQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **layoutlm** -- `LayoutLMForQuestionAnswering` (LayoutLM model)
- **layoutlmv2** -- `LayoutLMv2ForQuestionAnswering` (LayoutLMv2 model)
- **layoutlmv3** -- `LayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
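The mode toggle works the same way as in any `torch.nn.Module`: a `training` flag propagates to submodules such as dropout. The sketch below mimics that behavior with a hypothetical `DummyDropout` class rather than a real model:

```python
class DummyDropout:
    """Stand-in for a dropout module tracking the train/eval flag."""

    def __init__(self):
        self.training = True  # modules start in training mode

    def eval(self):
        self.training = False  # dropout becomes a no-op during inference
        return self

    def train(self):
        self.training = True  # dropout is active again
        return self


module = DummyDropout().eval()  # from_pretrained leaves the model like this
assert module.training is False
module.train()                  # switch back before fine-tuning
assert module.training is True
```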

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/layoutlm_tf_model_config.json")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./tf_model/layoutlm_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *TensorFlow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### TFAutoModelForDocumentQuestionAnswering[[transformers.TFAutoModelForDocumentQuestionAnswering]]

#### transformers.TFAutoModelForDocumentQuestionAnswering[[transformers.TFAutoModelForDocumentQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L653)

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForDocumentQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - `LayoutLMConfig` configuration class: `TFLayoutLMForQuestionAnswering` (LayoutLM model) - `LayoutLMv3Config` configuration class: `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForDocumentQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **layoutlm** -- `TFLayoutLMForQuestionAnswering` (LayoutLM model)
- **layoutlmv3** -- `TFLayoutLMv3ForQuestionAnswering` (LayoutLMv3 model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/layoutlm_pt_model_config.json")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./pt_model/layoutlm_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`. - A path or URL to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

#### transformers.AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2024)

This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForVisualQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [Blip2Config](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipForQuestionAnswering) (BLIP model)
  - `ViltConfig` configuration class: `ViltForQuestionAnswering` (ViLT model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVisualQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [BlipForQuestionAnswering](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipForQuestionAnswering) (BLIP model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (BLIP-2 model)
- **vilt** -- `ViltForQuestionAnswering` (ViLT model)
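In simplified terms, the resolution order described above (config `model_type` first, then pattern matching on the name or path) can be sketched in plain Python. This is an illustrative sketch, not the transformers implementation; class names are stored as strings for brevity:

```python
# Illustrative sketch of how an Auto class picks a concrete model class.
# Not the actual transformers registry, just the resolution order above.
VQA_MAPPING = {
    "blip": "BlipForQuestionAnswering",
    "blip-2": "Blip2ForConditionalGeneration",
    "vilt": "ViltForQuestionAnswering",
}

def resolve_model_class(model_type=None, name_or_path=""):
    # 1) Prefer the config's `model_type` when it is available.
    if model_type in VQA_MAPPING:
        return VQA_MAPPING[model_type]
    # 2) Fall back to pattern matching on the checkpoint name/path,
    #    trying longer keys first so "blip-2" wins over "blip".
    for key in sorted(VQA_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return VQA_MAPPING[key]
    raise ValueError(f"Could not infer a model class for {name_or_path!r}")
```

For example, `resolve_model_class(None, "dandelin/vilt-b32-finetuned-vqa")` falls through to pattern matching and returns `"ViltForQuestionAnswering"`.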

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
...     "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVision2Seq[[transformers.AutoModelForVision2Seq]]

#### transformers.AutoModelForVision2Seq[[transformers.AutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2272)

### TFAutoModelForVision2Seq[[transformers.TFAutoModelForVision2Seq]]

#### transformers.TFAutoModelForVision2Seq[[transformers.TFAutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_tf_auto.py#L612)

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.TFAutoModelForVision2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipForConditionalGeneration) (BLIP model)
  - `VisionEncoderDecoderConfig` configuration class: `TFVisionEncoderDecoderModel` (Vision Encoder decoder model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.
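Conceptually, `from_config()` dispatches on the *type* of the configuration object rather than on a name string, and builds the architecture without loading any weights. A minimal sketch of that idea, using hypothetical stand-in classes (not the real transformers registry):

```python
# Hypothetical stand-in classes; the real library maps e.g. BlipConfig to
# TFBlipForConditionalGeneration in the same type-keyed fashion.
class DemoConfigA: pass
class DemoConfigB: pass

class DemoModelA:
    def __init__(self, config): self.config = config

class DemoModelB:
    def __init__(self, config): self.config = config

CONFIG_TO_MODEL = {DemoConfigA: DemoModelA, DemoConfigB: DemoModelB}

def from_config(config):
    # Dispatch on the exact type of the config object.
    model_cls = CONFIG_TO_MODEL.get(type(config))
    if model_cls is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    # Only the architecture is built here; no pretrained weights are loaded.
    return model_cls(config)
```

Passing a `DemoConfigA` instance yields a `DemoModelA`; an unregistered config type raises a `ValueError`, mirroring the Auto class behavior.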

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Salesforce/blip-image-captioning-base")
>>> model = TFAutoModelForVision2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [BlipConfig](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.BlipConfig) configuration class: [TFBlipForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipForConditionalGeneration) (BLIP model)
  - `VisionEncoderDecoderConfig` configuration class: `TFVisionEncoderDecoderModel` (Vision Encoder decoder model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.TFAutoModelForVision2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [TFBlipForConditionalGeneration](/docs/transformers/v4.57.1/ko/model_doc/blip#transformers.TFBlipForConditionalGeneration) (BLIP model)
- **vision-encoder-decoder** -- `TFVisionEncoderDecoderModel` (Vision Encoder decoder model)

Examples:

```python
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForVision2Seq.from_pretrained("Salesforce/blip-image-captioning-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
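The kwargs behavior described for this parameter (keys matching configuration attributes update the config, the rest reach the model's `__init__`) amounts to a simple split. A sketch with a hypothetical helper, not transformers code:

```python
# Hypothetical helper illustrating the kwargs split described above.
def split_model_kwargs(config_attributes, kwargs):
    # Keys that are config attributes override the configuration.
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    # Everything else is forwarded to the model's __init__.
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

# Example: `output_attentions` is a config attribute, `foo` is not.
updates, rest = split_model_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "foo": 1},
)
```

Here `updates` is `{"output_attentions": True}` and `rest` is `{"foo": 1}`.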

### FlaxAutoModelForVision2Seq[[transformers.FlaxAutoModelForVision2Seq]]

#### transformers.FlaxAutoModelForVision2Seq[[transformers.FlaxAutoModelForVision2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_flax_auto.py#L370)

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.FlaxAutoModelForVision2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `VisionEncoderDecoderConfig` configuration class: `FlaxVisionEncoderDecoderModel` (Vision Encoder decoder model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
>>> model = FlaxAutoModelForVision2Seq.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - `VisionEncoderDecoderConfig` configuration class: `FlaxVisionEncoderDecoderModel` (Vision Encoder decoder model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.FlaxAutoModelForVision2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **vision-encoder-decoder** -- `FlaxVisionEncoderDecoderModel` (Vision Encoder decoder model)

Examples:

```python
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  - A path or url to a *PyTorch state_dict save file* (e.g., `./pt_model/pytorch_model.bin`). In this case, `from_pt` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the PyTorch model into a Flax model using the provided conversion scripts and loading the Flax model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_pt (`bool`, *optional*, defaults to `False`) : Load the model weights from a PyTorch checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Time Series

### AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

#### transformers.AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/modeling_auto.py#L2101)

This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created
with the [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTimeSeriesPrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L424)

- **config** ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - `TimesFmConfig` configuration class: `TimesFmModelForPrediction` (TimesFm model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v4.57.1/ko/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import TimesFmConfig, AutoModelForTimeSeriesPrediction

>>> # Instantiate a configuration for a model supported by this auto class.
>>> config = TimesFmConfig()
>>> model = AutoModelForTimeSeriesPrediction.from_config(config)
```

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - `TimesFmConfig` configuration class: `TimesFmModelForPrediction` (TimesFm model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTimeSeriesPrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/auto/auto_factory.py#L468)

Instantiate one of the model classes of the library (with a time-series prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesfm** -- `TimesFmModelForPrediction` (TimesFm model)
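The lookup above can be sketched with a plain dictionary keyed by `model_type` (illustrative only, not the real transformers implementation; the stand-in class is invented):

```python
# Sketch of model_type-based dispatch (NOT the real transformers code):
# the auto class maps a config's model_type string to a concrete model class.
class TimesFmModelForPredictionSketch:
    """Hypothetical stand-in for the real model class."""

    def __init__(self, config):
        self.config = config


MODEL_MAPPING = {"timesfm": TimesFmModelForPredictionSketch}


def from_pretrained_sketch(model_type, config):
    try:
        model_cls = MODEL_MAPPING[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")
    return model_cls(config)


model = from_pretrained_sketch("timesfm", config={"context_len": 512})
print(type(model).__name__)  # TimesFmModelForPredictionSketch
```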

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download model and configuration from huggingface.co and cache
>>> # (checkpoint name is illustrative; any repo whose model_type is "timesfm" works).
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch")

>>> # Update configuration during loading
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained(
...     "google/timesfm-2.0-500m-pytorch", output_attentions=True
... )
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.
  - A path or url to a *tensorflow index checkpoint file* (e.g., `./tf_model/model.ckpt.index`). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  - The model was saved using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (`dict[str, torch.Tensor]`, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

from_tf (`bool`, *optional*, defaults to `False`) : Load the model weights from a TensorFlow checkpoint save file (see docstring of `pretrained_model_name_or_path` argument).

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download : Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
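The `kwargs` split described above can be sketched as follows (a simplified illustration, not the real implementation; `ConfigSketch` and its attribute are invented stand-ins):

```python
# Sketch of the kwargs split (NOT the real transformers code): keys that
# match config attributes override the config; the rest are collected for
# the model's __init__.
class ConfigSketch:
    def __init__(self):
        self.output_attentions = False


def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # configuration override
        else:
            model_kwargs[key] = value     # forwarded to the model __init__
    return config, model_kwargs


config, model_kwargs = split_kwargs(
    ConfigSketch(), output_attentions=True, custom_arg=1
)
print(config.output_attentions, model_kwargs)  # True {'custom_arg': 1}
```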

