The SageMakerEndpoint class interacts with models deployed on SageMaker Inference Endpoints. It authenticates through the AWS client, which loads credentials automatically. To use a specific credential profile, pass the name of that profile from the ~/.aws/credentials file. The credentials or roles used must have the policies required to access the SageMaker endpoint.

Properties

CallOptions: BaseLLMCallOptions
ParsedCallOptions: Omit<BaseLLMCallOptions, never>
caller: AsyncCaller

Subclasses should make all async calls through this caller so that those calls benefit from its concurrency and retry logic.

client: SageMakerRuntimeClient
endpointName: string
streaming: boolean
verbose: boolean

Whether to print out response text.

callbacks?: Callbacks
endpointKwargs?: Record<string, unknown>
metadata?: Record<string, unknown>
modelKwargs?: Record<string, unknown>
tags?: string[]
lc_runnable: boolean = true
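
The caller property above is described as giving subclasses concurrency and retry logic. The following is a minimal sketch of that idea, not the library's AsyncCaller: the class name SketchCaller and the parameters maxConcurrency and maxRetries are hypothetical, and retries here happen immediately with no backoff.

```typescript
// Hypothetical stand-in for illustration; this is NOT LangChain's AsyncCaller.
class SketchCaller {
  private active = 0; // number of concurrency slots currently held
  private readonly waiters: Array<() => void> = [];

  constructor(
    private readonly maxConcurrency: number, // assumed option name
    private readonly maxRetries: number, // assumed option name
  ) {}

  // Take a slot immediately if one is free, otherwise wait in line.
  private async acquire(): Promise<void> {
    if (this.active < this.maxConcurrency) {
      this.active += 1;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  // Hand the slot straight to the next waiter, or free it if nobody waits.
  private release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else {
      this.active -= 1;
    }
  }

  // Run fn under the concurrency limit, retrying up to maxRetries times.
  async call<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      let lastError: unknown;
      for (let attempt = 0; attempt <= this.maxRetries; attempt += 1) {
        try {
          return await fn();
        } catch (error) {
          lastError = error; // remember the failure and try again
        }
      }
      throw lastError;
    } finally {
      this.release();
    }
  }
}
```

A subclass following this pattern would wrap each request in something like `caller.call(() => sendRequest())`, where sendRequest is a placeholder for whatever async work the subclass performs.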

Methods

  • This method takes an input and options, and returns a string. It converts the input to a prompt value and generates a result based on the prompt.

    Parameters

    Returns Promise<string>

    A string result based on the prompt.

  • This method is similar to call, but it's used for making predictions based on the input text.

    Parameters

    • text: string

      Input text for the prediction.

    • Optional options: string[] | BaseLLMCallOptions

      Options for the LLM call.

    • Optional callbacks: Callbacks

      Callbacks for the LLM call.

    Returns Promise<string>

    A prediction based on the input text.

  • Stream all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects, which include a list of jsonpatch ops that describe how the state of the run has changed in each step, and the final state of the run. The jsonpatch ops can be applied in order to construct state.

    Parameters

    Returns AsyncGenerator<RunLogPatch, any, unknown>
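
To illustrate how jsonpatch ops "can be applied in order to construct state", here is a simplified sketch. It assumes a reduced op shape (PatchOp, covering only add and replace) and hypothetical helper names (applyOp, applyPatches); a full JSON Patch implementation handles more operations and error cases.

```typescript
// Simplified, hypothetical op shape: only "add" and "replace" are handled.
type PatchOp = { op: "add" | "replace"; path: string; value: unknown };
type Json = Record<string, unknown>;

// Decode one JSON Pointer segment ("~1" is "/", "~0" is "~").
function decodeSegment(segment: string): string {
  return segment.replace(/~1/g, "/").replace(/~0/g, "~");
}

// Apply a single op. The state object is mutated in place,
// except when the op targets the root path "".
function applyOp(state: Json, op: PatchOp): Json {
  if (op.path === "") {
    return op.value as Json;
  }
  const segments = op.path.split("/").slice(1).map(decodeSegment);
  const last = segments.pop() as string;
  let target: any = state;
  for (const key of segments) {
    target = target[key]; // walk down to the parent container
  }
  if (Array.isArray(target)) {
    if (last === "-") {
      target.push(op.value); // "-" appends to the end of an array
    } else if (op.op === "add") {
      target.splice(Number(last), 0, op.value);
    } else {
      target[Number(last)] = op.value;
    }
  } else {
    target[last] = op.value;
  }
  return state;
}

// Apply each batch of ops in the order received.
function applyPatches(initial: Json, patches: PatchOp[][]): Json {
  let state = initial;
  for (const ops of patches) {
    for (const op of ops) {
      state = applyOp(state, op);
    }
  }
  return state;
}
```

With a stream of patches, one would collect the ops from each yielded item and pass them to applyPatches in arrival order; the result after the last batch corresponds to the final state of the run.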

  • Default implementation of transform, which buffers input and then calls stream. Subclasses should override this method if they can start producing output while input is still being generated.

    Parameters

    Returns AsyncGenerator<string, any, unknown>
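
The buffering behaviour described for the default transform can be sketched as follows. Both function names are hypothetical: streamSketch stands in for a model's stream method (here it simply echoes the prompt word by word), and transformSketch shows the buffer-then-stream pattern rather than the library's actual code.

```typescript
// Hypothetical stand-in for stream(): yields the prompt one word at a time.
async function* streamSketch(prompt: string): AsyncGenerator<string> {
  for (const word of prompt.split(" ")) {
    yield word;
  }
}

// Buffer every input chunk first, then stream once over the joined input.
// No output is produced until the input generator is exhausted.
async function* transformSketch(
  input: AsyncIterable<string>,
): AsyncGenerator<string> {
  let buffered = "";
  for await (const chunk of input) {
    buffered += chunk;
  }
  yield* streamSketch(buffered);
}
```

Because nothing is yielded until the input ends, a subclass that can emit output incrementally gains latency by overriding this behaviour, which is the situation the description refers to.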

  • Helper method to transform an Iterator of Input values into an Iterator of Output values, with callbacks. Use this to implement stream() or transform() in Runnable subclasses.

    Type Parameters

    Parameters

    • inputGenerator: AsyncGenerator<I, any, unknown>
    • transformer: ((generator, runManager?, options?) => AsyncGenerator<O, any, unknown>)

    • Optional options: BaseLLMCallOptions & { runType?: string }

    Returns AsyncGenerator<O, any, unknown>
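
As a rough illustration of transforming a stream "with callbacks", the sketch below wraps a transformer and reports start, end, and error events. The RunHooks interface and the transformWithHooks function are hypothetical; the real helper passes a run manager and options to the transformer, which this sketch leaves out.

```typescript
// Hypothetical callback shape; the real run manager API is not reproduced here.
interface RunHooks<O> {
  onStart?: () => void;
  onEnd?: (outputs: O[]) => void;
  onError?: (error: unknown) => void;
}

// Pipe inputGenerator through transformer, reporting lifecycle events.
async function* transformWithHooks<I, O>(
  inputGenerator: AsyncGenerator<I>,
  transformer: (generator: AsyncGenerator<I>) => AsyncGenerator<O>,
  hooks: RunHooks<O> = {},
): AsyncGenerator<O> {
  hooks.onStart?.();
  const outputs: O[] = [];
  try {
    for await (const chunk of transformer(inputGenerator)) {
      outputs.push(chunk); // keep a copy so onEnd can see the full output
      yield chunk;
    }
  } catch (error) {
    hooks.onError?.(error);
    throw error; // the caller still sees the original failure
  }
  hooks.onEnd?.(outputs);
}
```

A stream or transform implementation built this way only needs to supply the transformer generator; the event reporting stays in one place.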
