2025
- [C27] FairDICE: Fairness-Driven Offline Multi-Objective Reinforcement Learning. NeurIPS 2025
- [C26] SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation. ICLR 2025
2024
- [C23] Mitigating Covariate Shift in Behavioral Cloning via Robust Distribution Correction Estimation. NeurIPS 2024
- [C22] ROIDICE: Offline Return on Investment Maximization for Efficient Decision Making. NeurIPS 2024
- [C25] Body Transformer: Leveraging Robot Embodiment for Policy Learning. CoRL 2024
- [C24] Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies. ICLR 2024 (spotlight)
2023
- [C21] AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation. NeurIPS 2023
- [C20] SafeDICE: Offline Safe Imitation Learning with Non-Preferred Demonstrations. NeurIPS 2023
- [C19] Tempo Adaptation in Non-stationary Reinforcement Learning. NeurIPS 2023
2022
- [C15] LobsDICE: Offline Imitation Learning from Observation via Stationary Distribution Correction Estimation. NeurIPS 2022
- [C14] Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions. NeurIPS 2022
- [C18] COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation. ICLR 2022
- [C17] DemoDICE: Offline Imitation Learning with Supplementary Imperfect Demonstrations. ICLR 2022
- [C16] GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems. ICLR 2022
2021
- [C12,W5] OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation. ICML 2021; ICLR Workshop on Never-Ending RL, 2021
- [C11] Representation Balancing Offline Model-based Reinforcement Learning. ICLR 2021
- [C13] Monte-Carlo Planning and Learning with Language Action Value Estimates. ICLR 2021
2020
- [C7] Reinforcement Learning for Control with Multiple Frequencies. NeurIPS 2020
- [C10] Batch Reinforcement Learning with Hyperparameter Gradients. ICML 2020
- [C8] Monte-Carlo Tree Search in Continuous Action Spaces with Value Gradients. AAAI 2020
- [C9,W4] Bayes-Adaptive Monte-Carlo Planning and Learning for Goal-Oriented Dialogues. AAAI 2020; NeurIPS Workshop on Conversational AI, 2019
2019
- [C5] Trust Region Sequential Variational Inference. ACML 2019
- [C6] PyOpenDial: A Python-based Domain-Independent Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules. EMNLP 2019
2018
- [C4] Monte-Carlo Tree Search for Constrained POMDPs. NeurIPS 2018
- [W3] Monte-Carlo Tree Search for Constrained MDPs. ICML Workshop on Planning and Learning (PAL-18), 2018
- [J1] Layered Behavior Modeling via Combining Descriptive and Prescriptive Approaches: A Case Study of Infantry Company Engagement. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018
2017
- [C3,W2] Constrained Bayesian Reinforcement Learning via Approximate Linear Programming. IJCAI 2017; Scaling-Up Reinforcement Learning Workshop at ECML PKDD (SURL), 2017
- [C2] Hierarchically-partitioned Gaussian Process Approximation. AISTATS 2017