Existing precoding schemes in amplify-and-forward (AF) multiple-input multiple-output (MIMO) relay systems use linear precoders. In this paper, we consider a precoding scheme in which a Tomlinson-Harashima (TH) precoder (THP) is used at the source and a linear precoder at the relay. With a minimum-mean-squared-error (MMSE) receiver at the destination, we propose a new joint precoder design method. Since two precoders are involved, the transceiver design, formulated as an optimization problem, is difficult to solve. To overcome this difficulty, we propose cascading an additional unitary precoder with the TH precoder. The unitary precoder not only simplifies the optimization problem but also improves the MMSE performance. With the specially designed unitary precoder at the source, we can then adopt the primal decomposition method to solve the problem. With this method, the original optimization problem is first decomposed into a master problem and a subproblem, and then transformed into a relay precoder optimization problem. However, the resulting problem is not convex, and its solution cannot be obtained directly. We therefore propose a method that transforms it into a convex optimization problem. A closed-form solution can then be obtained from the Karush-Kuhn-Tucker (KKT) conditions. Simulations show that the proposed transceiver can significantly outperform existing linear transceivers.