Models

CPM-1

class bminf.models.CPM1(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None)[source]
__init__(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None) None[source]
generate(input_sentence: str, max_tokens: int = 128, top_n: Optional[int] = None, top_p: Optional[float] = None, temperature: float = 0.9, frequency_penalty: float = 0, presence_penalty: float = 0, stop_tokens: Optional[List[str]] = None) Tuple[str, bool][source]

Generate text that continues the input sentence.

Parameters
  • input_sentence – The input text to continue.

  • max_tokens – Maximum number of tokens to generate.

  • top_n – Sample only from the n most probable tokens at each step.

  • top_p – Sample only from the smallest set of tokens whose cumulative probability exceeds p (nucleus sampling).

  • temperature – Temperature for sampling. Higher values produce more diverse results.

  • frequency_penalty – Penalty applied to frequently generated tokens, discouraging the model from repeating the same content.

  • presence_penalty – Penalty applied to tokens that have already appeared, discouraging the model from dwelling on the same topic.

  • stop_tokens – A list of tokens that stop generation when produced.

Returns

The result sentence and a boolean indicating whether one of the stop_tokens was generated.
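
Example (a minimal sketch; assumes model weights are available and a CUDA device can be used; the prompt and sampling values are illustrative):

    import bminf

    cpm1 = bminf.models.CPM1()  # load CPM-1 on the default device
    # Continue a Chinese prompt ("The weather is nice today,")
    result, stopped = cpm1.generate(
        "今天天气不错，",
        max_tokens=32,
        top_p=0.9,
        stop_tokens=["。"],  # stop at the first full stop
    )
    print(result, stopped)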

CPM-2

class bminf.models.CPM2(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None)[source]
__init__(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None) None[source]
fill_blank(input_sentence: str, spans_position: Optional[List[int]] = None, max_tokens: int = 128, top_n: Optional[int] = None, top_p: Optional[float] = None, temperature: float = 0.9, frequency_penalty: float = 0, presence_penalty: float = 0)[source]

Generate text to fill the “<span>” blanks in the input sentence.

Parameters
  • input_sentence – Input sentence containing “<span>” placeholder tokens.

  • spans_position – List of span positions. If None, span positions are detected automatically.

  • max_tokens – Maximum number of tokens to generate.

  • top_n – Sample only from the n most probable tokens at each step.

  • top_p – Sample only from the smallest set of tokens whose cumulative probability exceeds p (nucleus sampling).

  • temperature – Temperature for sampling. Higher values produce more diverse results.

  • frequency_penalty – Penalty applied to frequently generated tokens, discouraging the model from repeating the same content.

  • presence_penalty – Penalty applied to tokens that have already appeared, discouraging the model from dwelling on the same topic.

Returns

A list of generated spans, including positions and contents.
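
Example of blank filling (a minimal sketch; the sentence is illustrative, and the exact structure of each result item is not assumed beyond carrying a position and content):

    import bminf

    cpm2 = bminf.models.CPM2()
    # "Beijing is the <span> of China." with one blank to fill
    spans = cpm2.fill_blank("北京是中国的<span>。", top_p=0.9)
    for span in spans:
        print(span)  # one entry per <span>, with its position and content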

generate(input_sentence: str, max_tokens: int = 128, top_n: Optional[int] = None, top_p: Optional[float] = None, temperature: float = 0.9, frequency_penalty: float = 0, presence_penalty: float = 0, stop_tokens: Optional[List[str]] = None) Tuple[str, bool][source]

Generate text that continues the input sentence.

Parameters
  • input_sentence – The input text to continue.

  • max_tokens – Maximum number of tokens to generate.

  • top_n – Sample only from the n most probable tokens at each step.

  • top_p – Sample only from the smallest set of tokens whose cumulative probability exceeds p (nucleus sampling).

  • temperature – Temperature for sampling. Higher values produce more diverse results.

  • frequency_penalty – Penalty applied to frequently generated tokens, discouraging the model from repeating the same content.

  • presence_penalty – Penalty applied to tokens that have already appeared, discouraging the model from dwelling on the same topic.

  • stop_tokens – A list of tokens that stop generation when produced.

Returns

The result sentence and a boolean indicating whether one of the stop_tokens was generated.
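
The interface mirrors CPM1.generate; a minimal sketch (prompt and settings are illustrative):

    import bminf

    cpm2 = bminf.models.CPM2()
    text, stopped = cpm2.generate(
        "天空是蔚蓝色，窗外有",  # "The sky is azure, and outside the window there is"
        max_tokens=64,
        top_p=1.0,
        temperature=0.9,
    )
    print(text)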

EVA

class bminf.models.EVA(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None)[source]
__init__(device_idx: Optional[int] = None, dynamic_memory: int = 536870912, memory_limit: Optional[int] = None, version: Optional[str] = None) None[source]
dialogue(context: List[str], max_tokens: int = 128, top_n: Optional[int] = 10, top_p: Optional[float] = None, temperature: float = 0.85, frequency_penalty: float = 0, presence_penalty: float = 0, truncation_length: Optional[int] = 256) Tuple[str, bool][source]

Generate a dialogue response based on the context.

Parameters
  • context – Dialogue context: a list of previous utterances in the conversation.

  • max_tokens – Maximum number of tokens to generate.

  • top_n – Sample only from the n most probable tokens at each step.

  • top_p – Sample only from the smallest set of tokens whose cumulative probability exceeds p (nucleus sampling).

  • temperature – Temperature for sampling. Higher values produce more diverse results.

  • frequency_penalty – Penalty applied to frequently generated tokens, discouraging the model from repeating the same content.

  • presence_penalty – Penalty applied to tokens that have already appeared, discouraging the model from dwelling on the same topic.

  • truncation_length – Length to which the dialogue context is truncated before generation.

Returns

The response generated by the model and a boolean flag (the return value is a Tuple[str, bool]).
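
Example of a dialogue turn (a minimal sketch; the utterances are illustrative, with each list element being one prior turn):

    import bminf

    eva = bminf.models.EVA()
    context = ["你好"]  # "Hello"
    response, flag = eva.dialogue(context, max_tokens=64, top_n=10)
    print(response)
    # Append the response and the next user turn to continue the conversation:
    context += [response, "今天天气怎么样？"]  # "How is the weather today?"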