
Parameter transformations inside ParamSet #215

Open

mb706 opened this issue Feb 8, 2019 · 2 comments · May be fixed by #225

Comments

@mb706
Contributor

mb706 commented Feb 8, 2019

Sometimes it would be useful to specify how parameters are changed inside a Learner / PipeOp, e.g. as in mlr-org/mlr3pipelines#24. A typical example is the mtry parameter of a random forest, which should range from 1 to task$ncol. It would be nice if one could introduce an mtry.pexp parameter ranging from 0 to 1, so that the actual mtry is set to round(task$ncol ^ mtry.pexp).

The $trafo function, as it currently stands, is not a good fit for this, because it (1) operates before the Learner even sees the Task, so it wouldn't know about task$ncol, and (2) cannot introduce a new parameter mtry.pexp; it can only re-scale the existing mtry, which is an integer between 1 and Inf, not a real number between 0 and 1.

I think the following UI would be quite nice:

lrn = mlr_learners$get("classif.ranger")
ps = lrn$param_set$clone()
ps$subset(setdiff(ps$ids(), "mtry"))
ps$add(ParamDbl$new("mtry.pexp", 0, 1))
ps$trafo = function(x, env, param_set) {
  x$mtry = round(env$task$ncol ^ x$mtry.pexp)
  x$mtry.pexp = NULL
  x
}

lrn$param_set$add_interface(ps)  # !!

# set effective `mtry` to `round(ncol(task) ^ 0.7)` when training happens
lrn$param_set$values$mtry.pexp = 0.7

lrn$param_set$values$mtry = 3 # ERROR

This would change the lrn$param_set to "look and feel" like the ps constructed above, while internally the Learner (or e.g. a PipeOp) would get the parameter values as transformed by the $trafo function.
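
For instance (hypothetical behaviour under this proposal):

lrn$param_set$ids()  # contains "mtry.pexp", no longer "mtry"
lrn$param_set$values$mtry.pexp = 1.5  # ERROR: outside [0, 1]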

A way to implement this would be the following:

  1. Add a private$.learnerside = NULL slot that points to the ParamSet that the Learner / PipeOp should see.
  2. Add a $has_interface active binding:
    has_interface = function() !is.null(private$.learnerside)
  3. Add a self$learnerside(last = TRUE) function that gives the ParamSet that the Learner / PipeOp should see. Because private$.learnerside could point to a ParamSet that itself has a private$.learnerside set, it should be recursive if last is TRUE, and only give the "next" learnerside if last is FALSE.
    learnerside = function(last = TRUE) {
      if (!self$has_interface)
        return(self)
      if (last) {
        private$.learnerside$learnerside(last = TRUE)
      } else {
        private$.learnerside
      }
    }
  4. Implement a private$copy_param_set() helper function. It copies all relevant items from its argument onto the ParamSet itself, turning self into an effective copy of that argument:
    copy_param_set = function(param_set) {
      private$.params = param_set$params
      private$.deps = param_set$deps
      private$.values = param_set$values
      private$.trafo = param_set$trafo
      invisible(self)
    }
  5. Implement the public $add_interface() function:
    add_interface = function(param_set) {
      private$.learnerside = self$clone(deep = TRUE)
      private$copy_param_set(param_set)
    }
  6. Implement a public $remove_interface() function:
    remove_interface = function(all = FALSE) {
      if (!self$has_interface)
        stop("no interface to remove")
      replace_with = self$learnerside(last = all)
      private$copy_param_set(replace_with)
      # NULL when the replacement has no interface of its own; a real R6
      # implementation would need an accessor here, since the private
      # field of another instance cannot be read directly
      private$.learnerside = if (replace_with$has_interface)
        replace_with$learnerside(last = FALSE)
    }
  7. How does the Learner / PipeOp get its value out of this? There probably should be a $get_values() function that gets the values for the operation, which should also have the filter functionality that ids currently has.
    get_values = function(class = NULL, tags = NULL, learnerside = FALSE, env) {
      if (learnerside && self$has_interface) {
        private$.learnerside$values = self$trafo(self$values, env)
        return(private$.learnerside$get_values(
          class = class, tags = tags, learnerside = learnerside, env = env
        ))
      }
      values = self$values
      values[intersect(names(values), self$ids(class = class, tags = tags))]
    }
  8. Change the trafo active binding to also accept functions of the form function(x, env), as sketched below.
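
A minimal sketch of step 8 (an illustration only: it assumes the binding stores the function in a private$.trafo field and detects the accepted signature from the function's formals):

trafo = function(f) {
  if (missing(f))
    return(private$.trafo)
  if (!is.null(f)) {
    stopifnot(is.function(f))
    # allow function(x, param_set), function(x, env),
    # and function(x, env, param_set)
    stopifnot(all(names(formals(f)) %in% c("x", "env", "param_set")))
  }
  private$.trafo = f
}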

This implementation has the advantage that multiple interfaces can be "stacked" on top of each other: a user who gets a Learner does not need to know or care whether something put an interface in front of its ParamSet. When the user sets a parameter using param_set$values$param = x, the value gets checked against the constraints of the interface parameter set. When they call lrn$train(), the train() function calls get_values(tags = "train", learnerside = TRUE, env = list(task = task)), which recurses through the interfaces that were added and sets $values in each one after transforming. This automatically checks that each trafo returns values that are feasible for the original ParamSet.
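
As an illustration of the stacking (hypothetical code, assuming the $add_interface() semantics proposed above; the parameter bounds and the plogis squashing are arbitrary choices):

ps2 = lrn$param_set$clone()
ps2$subset(setdiff(ps2$ids(), "mtry.pexp"))
ps2$add(ParamDbl$new("mtry.plogis", -10, 10))
ps2$trafo = function(x, env, param_set) {
  x$mtry.pexp = stats::plogis(x$mtry.plogis)  # squash into (0, 1)
  x$mtry.plogis = NULL
  x
}
lrn$param_set$add_interface(ps2)  # two interfaces are now stacked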

This change would also be completely transparent to all existing ParamSet functionality.

Things that I am not sure about:

  • It is a bit inelegant to have the env parameter depend on what kind of object the ParamSet belongs to: Some PipeOps (e.g. PipeOpModelAvg) have parameters in a different context, where no task is present (and instead maybe a prediction). One would probably want to agree on an interface (always task in a Learner / preprocessing PipeOp, always prediction in a "post-processing" PipeOp, other contexts..?)
  • There are no checks on the feasibility of the trafo function output until the actual training / predicting happens.
  • Maybe one still wants to use the "train" / "predict" tags from the outside: e.g. a tuning algorithm may want to train a model with one set of "train" parameters and then evaluate it with several different "predict" parameters, to get multiple performance datapoints from a single train() call for efficiency. In that case it would be nice if the trafo also respected the "train" / "predict" tags and worked when only a subset of parameter values is present. get_values would then need to be adapted to hand only self$values[intersect(names(self$values), self$ids(tags = tags))] to self$trafo; see the sketch after this list.
  • I don't know if it would be useful to do this for ParamSetCollection. Maybe a GraphLearner would want to have an interface as well? I wouldn't know what the UI for that would look like, however. In that case it would probably be easiest to intervene at the level of the individual PipeOps' ParamSets.
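
A sketch of that adapted branch inside $get_values() (names as in step 7 above; note that the trafo must then tolerate partial input, and the class filter is omitted for brevity):

if (learnerside && self$has_interface) {
  # hand only the tag-filtered subset of the values to the trafo
  active = self$values[intersect(names(self$values), self$ids(tags = tags))]
  private$.learnerside$values = self$trafo(active, env)
  return(private$.learnerside$get_values(tags = tags,
    learnerside = TRUE, env = env))
}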
@jakob-r
Sponsor Member

jakob-r commented Feb 20, 2019

Hmm. I wanted to find the discussion we already had with @berndbischl about this topic. Maybe someone can link it if they find it. We also came to the conclusion that we want to allow adding a parameter like mtry.pexp and then using the trafo.
I think one major issue is that a fixed specification of when the transformation is carried out would help. I just wonder: where would the trafo be applied when the task (e.g. task$ncol) is not yet known?

@mb706
Contributor Author

mb706 commented Feb 20, 2019

Maybe you mean the discussion in mlr-org/mlr3pipelines#24 and Bernd's comment in particular?
I think that has two problems:

  1. The way the transformation currently works is that trafo is called in Design$transpose. The transpose function does not know the task, and does not even know what learner / other object the resulting parameter settings are going to be used on.
  2. Sometimes the parameter transformations we want depend on the task at a certain point in the pipeline. (I'm thinking about this in the context of parameters in mlr3pipelines or when using wrappers.) This has been a problem in mlr with preprocessing: if a feature filter removes an unknown number of features, then a transformation that should depend on the number of features would have to happen inside the Learner. The solution so far has been to re-implement the Learner with the desired transformation written into new custom learner code, which is very inconvenient.

If we have to attach the parameter transformation to a certain PipeOp / Learner step inside a Graph (or TuneWrapper, when they exist), then I think it makes sense to put the transformation inside the ParamSet of these objects. Using the interface / learnerside semantics above is my suggestion for how to do that in a nice way: the transformation is added, yet remains transparent to the user.

I would say the $trafo that is currently present is very useful and should stay: it transforms between sampling distributions and can add parameter values that could not be sampled (e.g. functions, or actual vectors). I don't think it can (or should) be used for the kind of task-dependent transformation functionality that I am suggesting here.
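
For reference, a minimal sketch of that existing design-level use of $trafo (using the current paradox API; the trafo runs in Design$transpose(), with no task in sight):

library(paradox)
ps = ParamSet$new(list(ParamDbl$new("cost", lower = -10, upper = 10)))
ps$trafo = function(x, param_set) {
  x$cost = 2 ^ x$cost  # sample on a log2 scale, transform back
  x
}
design = generate_design_random(ps, 3)
design$transpose()  # list of value lists, cost already transformed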

@mb706 mb706 linked a pull request Mar 3, 2019 that will close this issue