Multi-modal Medical Diagnosis via Large-small Model Collaboration