Publication | Open Access
Learning Transferable Features with Deep Adaptation Networks
Citations: 2.8K
References: 28
Year: 2015
Convolutional Neural Network, Engineering, Machine Learning, Data Science, Feature Learning, Pattern Recognition, Domain Adaptation, Feature Transformation, Multi-task Learning, Feature Transferability, Computer Science, Transfer Learning, Transferable Features, Deep Learning, Deep Neural Network, Computer Vision
Deep neural networks can learn transferable features, but transferability declines in higher layers as domain discrepancy increases. The paper aims to reduce dataset bias and improve transferability in task-specific layers. It introduces the Deep Adaptation Network (DAN), which embeds the hidden representations of all task-specific layers in a reproducing kernel Hilbert space and matches the mean embeddings of the domain distributions, using an optimal multi-kernel selection method to reduce the discrepancy further. DAN learns transferable features with statistical guarantees and scales linearly through an unbiased estimate of the kernel embedding. Experiments show that DAN achieves state-of-the-art image classification error rates on standard domain adaptation benchmarks.
Recent studies reveal that a deep neural network can learn transferable features which generalize well to novel tasks for domain adaptation. However, as deep features eventually transition from general to specific along the network, the feature transferability drops significantly in higher layers with increasing domain discrepancy. Hence, it is important to formally reduce the dataset bias and enhance the transferability in task-specific layers. In this paper, we propose a new Deep Adaptation Network (DAN) architecture, which generalizes deep convolutional neural networks to the domain adaptation scenario. In DAN, hidden representations of all task-specific layers are embedded in a reproducing kernel Hilbert space where the mean embeddings of different domain distributions can be explicitly matched. The domain discrepancy is further reduced using an optimal multi-kernel selection method for mean embedding matching. DAN can learn transferable features with statistical guarantees, and can scale linearly by using an unbiased estimate of the kernel embedding. Extensive empirical evidence shows that the proposed architecture yields state-of-the-art image classification error rates on standard domain adaptation benchmarks.
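The mean-embedding matching and the linear-time unbiased estimate mentioned in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes Gaussian kernels, fixed uniform kernel weights (the abstract describes an optimized multi-kernel selection, which is not reproduced here), and plain Python lists standing in for layer activations. The function names, bandwidths, and toy data are all hypothetical. The estimator groups consecutive samples into disjoint quadruples, so the number of kernel evaluations grows linearly with the sample size.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def multi_kernel(x, y, bandwidths, weights):
    """Convex combination of Gaussian kernels (weights >= 0, summing to 1)."""
    return sum(w * gaussian_kernel(x, y, b) for w, b in zip(weights, bandwidths))


def linear_time_mmd2(source, target, bandwidths, weights):
    """Unbiased linear-time estimate of the squared mean-embedding distance.

    Consecutive samples form disjoint quadruples (x1, x2, y1, y2), so the
    cost is O(n) kernel evaluations rather than O(n^2) for all pairs.
    """
    n = min(len(source), len(target))
    n -= n % 2  # use an even number of samples from each domain
    if n < 2:
        raise ValueError("need at least two samples per domain")
    total = 0.0
    for i in range(0, n, 2):
        x1, x2 = source[i], source[i + 1]
        y1, y2 = target[i], target[i + 1]
        total += (
            multi_kernel(x1, x2, bandwidths, weights)
            + multi_kernel(y1, y2, bandwidths, weights)
            - multi_kernel(x1, y2, bandwidths, weights)
            - multi_kernel(x2, y1, bandwidths, weights)
        )
    return total / (n // 2)


def sample(rng, mean, count, dim=2):
    """Toy stand-in for layer activations: isotropic Gaussian features."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


# Assumed settings for illustration only.
bandwidths = [0.5, 1.0, 2.0]
weights = [1.0 / 3.0] * 3  # uniform, not optimized

rng = random.Random(0)
source = sample(rng, 0.0, 2000)
shifted_target = sample(rng, 3.0, 2000)

identical = linear_time_mmd2(source, source, bandwidths, weights)
shifted = linear_time_mmd2(source, shifted_target, bandwidths, weights)
print("identical samples:", identical)
print("shifted domain:", shifted)
```

In a network, a term of this kind would be computed on the activations of each adapted layer and added to the classification loss, so that training lowers the discrepancy between source and target representations. The sketch only shows how the estimate separates matched from mismatched distributions.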