torchtable.operator module
Module contents
class torchtable.operator.Operator
Bases: object
Base class for all operators. Operators can be chained together by piping their outputs to new operators or hooking operators to other operators. Any number of operators can be chained to become a pipeline, which is itself just another operator. Subclasses should implement the apply method that defines the operation performed by the operator.
Example
>>> class TimesThree(Operator):
...     def apply(self, x):
...         return x * 3
>>> op = TimesThree()
>>> op(4)  # 4 * 3 = 12
12
>>> class Square(Operator):
...     def apply(self, x):
...         return x ** 2
>>> op = TimesThree() > Square()
>>> op(2)  # (2 * 3) ** 2 = 36
36
apply(x: Any, train=True) → Any
Takes the output of the previous stage in the pipeline and produces output. Override in subclasses.
Parameters: train – If true, this operator will “train” on the input. In other words, the internal parameters of this operator may change to fit the given input.
hook(op: torchtable.operator.core.Operator) → torchtable.operator.core.Operator
Connect an operator to the beginning of this pipeline. Returns self.
pipe(op: torchtable.operator.core.Operator) → torchtable.operator.core.Operator
Connect an operator after this operator. Returns the connected operator.
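The difference between pipe and hook can be illustrated with a small stand-in class. This is a minimal sketch, not torchtable's implementation: `SketchOperator`, its `before` attribute, and the `__call__` logic are assumptions made for illustration. Only the names `apply`, `pipe`, and `hook` and their documented return values come from the reference above.

```python
class SketchOperator:
    """Hypothetical stand-in for Operator; not the library's code."""

    def __init__(self):
        # Assumption: each operator remembers the operator that runs before it.
        self.before = None

    def apply(self, x, train=True):
        raise NotImplementedError

    def pipe(self, op):
        """Connect `op` after this operator and return `op`."""
        op.before = self
        return op

    def hook(self, op):
        """Connect `op` to the beginning of this pipeline and return self."""
        head = self
        while head.before is not None:
            head = head.before
        head.before = op
        return self

    def __call__(self, x, train=True):
        # Run everything upstream first, then this operator.
        if self.before is not None:
            x = self.before(x, train=train)
        return self.apply(x, train=train)


class TimesThree(SketchOperator):
    def apply(self, x, train=True):
        return x * 3


class Square(SketchOperator):
    def apply(self, x, train=True):
        return x ** 2


piped = TimesThree().pipe(Square())   # runs TimesThree, then Square
hooked = TimesThree().hook(Square())  # runs Square, then TimesThree

piped_result = piped(2)    # (2 * 3) ** 2
hooked_result = hooked(2)  # (2 ** 2) * 3
```

In this sketch, pipe returns the downstream operator and hook returns the operator it was called on, so either return value can be called directly as the whole pipeline.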
class torchtable.operator.LambdaOperator(func: Callable[T, T])
Bases: torchtable.operator.core.Operator
Generic operator for stateless operation.
Parameters: func – Function to apply to input.
apply(x: T, train=True) → Any
Takes the output of the previous stage in the pipeline and produces output. Override in subclasses.
Parameters: train – If true, this operator will “train” on the input. In other words, the internal parameters of this operator may change to fit the given input.
class torchtable.operator.TransformerOperator(transformer)
Bases: torchtable.operator.core.Operator
Wrapper for any stateful transformer with fit and transform methods.
Parameters: transformer – Any object with a fit and transform method.
Example
>>> op = TransformerOperator(sklearn.preprocessing.StandardScaler())
apply(x: Any, train=True)
Takes the output of the previous stage in the pipeline and produces output. Override in subclasses.
Parameters: train – If true, this operator will “train” on the input. In other words, the internal parameters of this operator may change to fit the given input.
build(x: Any) → None
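The effect of the train flag on a stateful transformer can be sketched as follows. This is an illustration under assumptions, not the library's code: `MeanCenterer` and `SketchTransformerOperator` are hypothetical names, and the rule "fit only when train is true, then always transform" is one plausible reading of the description above.

```python
class MeanCenterer:
    """Hypothetical transformer exposing fit and transform."""

    def __init__(self):
        self.mean_ = None

    def fit(self, xs):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [x - self.mean_ for x in xs]


class SketchTransformerOperator:
    """Hypothetical wrapper; assumes fitting happens only when train=True."""

    def __init__(self, transformer):
        self.transformer = transformer

    def apply(self, x, train=True):
        if train:
            # Internal parameters change to fit the given input.
            self.transformer.fit(x)
        return self.transformer.transform(x)


op = SketchTransformerOperator(MeanCenterer())
train_out = op.apply([1.0, 2.0, 3.0], train=True)  # learns mean 2.0
test_out = op.apply([10.0, 20.0], train=False)     # reuses the learned mean
```

With train=False the second call leaves the learned mean untouched, so held-out data is shifted by the training statistics rather than its own.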
class torchtable.operator.Normalize(method: Optional[str])
Bases: torchtable.operator.core.TransformerOperator
Normalizes a numeric field.
Parameters: method – Method of normalization. Choose from the following:
- None – No normalization will be applied (same as noop)
- 'Gaussian' – Subtracts the mean and divides by the standard deviation
- 'RankGaussian' – Assigns elements to a Gaussian distribution based on their rank
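The two non-trivial methods can be sketched with the standard library alone. This is a rough illustration, not the library's implementation: the function names are hypothetical, the population standard deviation is an assumption, and the rank-to-quantile rule `(rank + 0.5) / n` is one common choice among several. Ties are not handled specially.

```python
from statistics import NormalDist, mean, pstdev


def gaussian_normalize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def rank_gaussian_normalize(values):
    """Map each value to a standard normal quantile based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Keep the quantile strictly inside (0, 1) so inv_cdf is defined.
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


data = [1.0, 2.0, 3.0, 4.0, 100.0]
z = gaussian_normalize(data)
r = rank_gaussian_normalize(data)
```

Because the rank-based version only looks at ordering, the outlier 100.0 ends up closer to the rest of the data than it does after the mean and standard deviation scaling.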
class torchtable.operator.FillMissing(method: Union[Callable, str])
Bases: torchtable.operator.core.TransformerOperator
Fills missing values according to method.
Parameters: method – Method of filling missing values. Options:
- None – Do not fill missing values
- 'median' – Fill with median
- 'mean' – Fill with mean
- 'mode' – Fill with mode. Effective for categorical fields.
- any callable – The output of the callable will be used to fill the missing values
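The options above can be sketched on plain Python lists. This is a simplified illustration rather than the library's code: `fill_missing` and `is_missing` are hypothetical helpers, missing values are taken to be None or NaN, and passing the observed (non-missing) values to the callable is an assumption, since the reference does not say what the callable receives.

```python
import math
from statistics import mean, median, mode


def is_missing(v):
    """Treat None and float NaN as missing (assumption for this sketch)."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def fill_missing(values, method):
    if method is None:
        return list(values)  # leave missing values untouched
    observed = [v for v in values if not is_missing(v)]
    if callable(method):
        fill = method(observed)  # assumed calling convention
    else:
        fill = {"median": median, "mean": mean, "mode": mode}[method](observed)
    return [fill if is_missing(v) else v for v in values]


nums = [1.0, math.nan, 3.0, 8.0]
cats = ["a", None, "b", "a"]

by_median = fill_missing(nums, "median")
by_mean = fill_missing(nums, "mean")
by_mode = fill_missing(cats, "mode")
by_callable = fill_missing(nums, lambda observed: -1.0)
```

The mode is the only built-in option here that works on non-numeric values, which matches the note that it is effective for categorical fields.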
class torchtable.operator.Vocab(min_freq=0, max_features=None, handle_unk: Optional[bool] = False, nan_as_unk=False)
Bases: object
Mapping from category to integer id.
fit(x: pandas.core.series.Series) → torchtable.operator.core.Vocab
Construct the mapping.
transform(x: pandas.core.series.Series) → pandas.core.series.Series
class torchtable.operator.Categorize(min_freq: int = 0, max_features: Optional[int] = None, handle_unk: Optional[bool] = None)
Bases: torchtable.operator.core.TransformerOperator
Converts categorical data into integer ids.
Parameters: - min_freq – Minimum frequency required for a category to receive a unique id. Any categories with a lower frequency will be treated as unknown categories.
- max_features – Maximum number of unique categories to store. If the number of actual categories exceeds max_features, the categories with the highest frequencies will be chosen. If None, there will be no limit on the number of categories.
- handle_unk – Whether to allocate a unique id to unknown categories. If you expect to see categories that you did not encounter in your training data, you should set this to True. If None, handle_unk will be set to True if min_freq > 0 or max_features is not None, otherwise it will be False.
vocab_size
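How min_freq, max_features, and handle_unk interact can be sketched with a frequency counter. This is an illustration under assumptions, not the library's implementation: `build_vocab` and `encode` are hypothetical helpers, reserving id 0 for unknown categories is a choice made for this sketch, and raising a plain ValueError for unseen categories is likewise only illustrative. The default rule for handle_unk follows the parameter description above.

```python
from collections import Counter


def build_vocab(values, min_freq=0, max_features=None, handle_unk=None):
    if handle_unk is None:
        # Documented default: enable unknown handling if any filtering is on.
        handle_unk = min_freq > 0 or max_features is not None
    counts = Counter(values)
    # Keep the most frequent categories, then drop those below min_freq.
    kept = [c for c, n in counts.most_common(max_features) if n >= min_freq]
    offset = 1 if handle_unk else 0  # assumption: id 0 means "unknown"
    mapping = {c: i + offset for i, c in enumerate(kept)}
    return mapping, handle_unk


def encode(values, mapping, handle_unk):
    if handle_unk:
        return [mapping.get(v, 0) for v in values]
    missing = [v for v in values if v not in mapping]
    if missing:
        raise ValueError(f"unknown categories: {missing}")
    return [mapping[v] for v in values]


train = ["a", "a", "a", "b", "b", "c"]

filtered, filtered_unk = build_vocab(train, min_freq=2)
encoded = encode(["a", "c", "z"], filtered, filtered_unk)

full, full_unk = build_vocab(train)
```

With min_freq=2 the rare category "c" and the unseen category "z" share the unknown id, while the unfiltered vocabulary gives every training category its own id and rejects anything unseen.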
class torchtable.operator.ToTensor(dtype: torch.dtype)
Bases: torchtable.operator.core.Operator
Convert input to a torch.tensor.
Parameters: dtype – The dtype of the output tensor
apply(x: Union[pandas.core.series.Series, numpy.core.multiarray.array], device: Optional[torch.device] = None, train=True) → torch.tensor
Takes the output of the previous stage in the pipeline and produces output. Override in subclasses.
Parameters: train – If true, this operator will “train” on the input. In other words, the internal parameters of this operator may change to fit the given input.
exception torchtable.operator.UnknownCategoryError
Bases: ValueError