Introduction
This blog is for students who are interested in deep learning, computer vision, image processing, and related fields. The insights come from a detailed discussion with Mr. Dhruv, who holds a Bachelor of Technology in Electronics and Communication from NIT Durgapur and a Master’s in Biomedical Engineering from a Jodhpur-based university, and is currently a Ph.D. candidate at IIT Bombay working on medical image analysis. He has built predictive models for segmentation and classification of medical images, including radiological and histopathological images. The thoughts shared here aim to guide students on which subjects to study, how to manage research projects, how to read research papers effectively, how to select practical projects, how to handle the pressures of a Ph.D., and how these skills translate into real-world applications.
Educational Journey and Core Subjects
Mr. Dhruv started with a B.Tech in Electronics and Communication. After that, he appeared for the GATE examination and got into a Master’s program in Biomedical Engineering. Following his Master’s, he joined IIT Bombay as a Ph.D. candidate, focusing on computer vision for medical image analysis. He explained that his work involves building machine learning models—segmentation models, classification models—for various types of medical images, including MRI scans and histopathological slides.
He highlighted that to be effective in the domain of deep learning, computer vision, and image processing, students should build a strong foundation in certain subjects. He mentioned four critical areas:
- Deep Learning and Machine Learning: Understanding neural networks and deep architectures is essential. He frequently uses ResNet models (ResNet-18, ResNet-34, ResNet-50) and sometimes transformer-based models, depending on the problem’s complexity. Deep learning knowledge is a must because his current work relies heavily on these methods for analyzing images.
- Image Processing and Computer Vision: Knowledge of image processing is necessary to handle tasks like segmentation, classification, and dealing with medical images. He works with MRI data, detecting tumors, infarct regions (like after a stroke), and other subtle features. Image processing helps prepare images, handle augmentations, and understand pixel-level manipulations.
- Digital Signal Processing (DSP): Not every DSP concept is needed, but part of the subject forms the basis of digital image processing. He mentioned that a sufficient background in DSP helps in understanding how to manipulate and analyze signals, and this knowledge transfers naturally to images, since images are also signals, only in two dimensions.
- Probability and Statistics: Basics of probability and statistics are crucial. Loss functions, which are central to training deep learning models, often come from statistical concepts. Understanding probability distributions, statistical inference, and related ideas helps in interpreting model outputs and selecting the right metrics.
He emphasized that these four areas—deep learning, image processing, DSP (to some extent), and probability/statistics—are interconnected. For example, digital image processing often builds on DSP fundamentals, and probability/statistics underlie the selection of appropriate loss functions and evaluation criteria.
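To make the DSP connection concrete, here is a minimal sketch of 2D convolution written in plain Python. It is only an illustration of the idea that an image is a two-dimensional signal, not code from Mr. Dhruv's work; the function name `convolve2d`, the toy image, and the gradient kernel are all assumptions made for this example.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a grayscale image given as a list of rows."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # True convolution flips the kernel along both axes
    # (cross-correlation, used by most deep learning layers, does not).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Toy 4x5 image (hypothetical data): dark left part, bright right part.
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]

# A 1D difference filter [1, 0, -1] stacked into 2D: it responds to
# horizontal intensity changes, i.e. vertical edges.
edge_kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

edges = convolve2d(image, edge_kernel)
print(edges)  # [[0, 30, 30], [0, 30, 30]]
```

The output is zero over the flat dark region and non-zero where the window covers the dark-to-bright boundary, which is the same behaviour a 1D difference filter shows on a step signal. In practice a library such as OpenCV would do this far faster; the loop version is only meant to show what happens per pixel.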
Specific Topics and Daily Work in Research
In his research lab at IIT Bombay, the focus often involves applying computer vision to medical imaging. Projects can range from testing new models on natural images for general computer vision insights to working on application-based tasks like tumor segmentation in MRI scans. For instance, when someone has a stroke, MRI images may show water accumulation or infarct areas that need to be segmented. He uses supervised learning for many tasks, but sometimes employs self-supervised or semi-supervised learning if the dataset is large but not fully annotated.
He chooses models based on the task and performance requirements. Classical CNN-based models like ResNet can be tried first. If the problem demands it, he may consider transformer-based models, but these usually require more computational resources. He also stressed that depending on the problem, one might need to try various algorithms. For example, if the dataset is partially labeled, semi-supervised methods are a good fit.
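One common semi-supervised idea for partially labeled data is pseudo-labeling: train on the labeled part, predict on the unlabeled part, and keep only the confident predictions as extra training labels. The sketch below shows just the selection step. It is an assumption about how such a step might look, not a description of the method used in his lab; the function name, the 0.9 threshold, and the probabilities are made up for illustration.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, predicted_class) for confident predictions.

    `probabilities` holds one list of class probabilities per unlabeled sample.
    """
    selected = []
    for idx, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=lambda k: probs[k])
        if probs[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Hypothetical model outputs for four unlabeled images (two classes).
unlabeled_probs = [
    [0.97, 0.03],  # confident: class 0
    [0.55, 0.45],  # uncertain: skipped
    [0.08, 0.92],  # confident: class 1
    [0.40, 0.60],  # uncertain: skipped
]

pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)
print(pseudo)  # [(0, 0), (2, 1)]
```

The threshold is a trade-off: a high value adds fewer but cleaner labels, while a low value adds more labels at the risk of reinforcing the model's own mistakes.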
Tools and Libraries for Implementation
For implementing these models, Mr. Dhruv mentioned that Python is the primary language because it has numerous libraries that simplify the work:
- PyTorch: For building and training deep learning models.
- NumPy, Pandas: For handling data, arrays, and basic manipulations.
- scikit-learn (sklearn): For traditional machine learning tasks and preprocessing.
- OpenCV: For a wide range of image processing operations.
- PyDICOM (for medical images), pyradiomics, and other specialized libraries: Helpful in reading and analyzing medical image formats.
He said that many of these libraries reduce the burden of coding from scratch, allowing researchers to focus on experiments and improving the performance of models rather than dealing with low-level implementations.
Project Suggestions and Handling Datasets
When advising students on projects, Mr. Dhruv noted that the main challenge is often data. One should start with publicly available datasets. The complexity of the project depends on the time you have—maybe six months or a year. He recommends picking a dataset, setting a goal (like segmenting a certain structure in images), and then progressively improving the model:
- Begin with straightforward supervised learning if annotations are available.
- If you have a large dataset but fewer labels, consider self-supervised or semi-supervised techniques.
- Try different loss functions, model architectures, and augmentations.
- Keep track of performance improvements over time.
He explained that making a project interesting involves breaking it down into smaller tasks and refining parameters step-by-step. Some advanced ideas include tackling domain generalization (when images come from multiple sources and the model’s performance varies) or handling noisy labels (data that may have imperfect annotations). All these improvements and explorations enhance your understanding of the field and prepare you for research-like thinking.
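When trying augmentations on a segmentation project, one detail is easy to miss: any geometric transform applied to the image must be applied identically to its mask, otherwise the labels no longer line up with the pixels. The following is a minimal sketch of that idea in plain Python; the helper names (`hflip`, `rot90`, `augment_pair`) and the toy data are assumptions for this example, and a real project would more likely rely on an augmentation library.

```python
import random


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image, mask = rot90(image), rot90(mask)
    return image, mask


# Toy example (hypothetical data): the mask marks every non-zero pixel.
image = [
    [0, 5, 0],
    [0, 7, 9],
]
mask = [
    [0, 1, 0],
    [0, 1, 1],
]

rng = random.Random(0)  # fixed seed so the run is repeatable
aug_image, aug_mask = augment_pair(image, mask, rng)

# The mask still marks exactly the non-zero pixels after augmentation.
still_aligned = aug_mask == [[1 if v > 0 else 0 for v in row] for row in aug_image]
print(still_aligned)
```

Drawing the random decisions once and using them for both inputs is what keeps the pair aligned; drawing them separately for the image and the mask would silently corrupt the training labels.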
Working Methodically: Literature Review, Comparison, and Novelty
For a Ph.D. or a long-term research project, Mr. Dhruv outlined a clear methodology:
- Identify a Broad Area: Start by deciding on the domain, e.g., medical image segmentation.
- Literature Review: Read recent papers to understand the state-of-the-art methods. Pick a well-known public dataset, such as the BraTS dataset for brain tumor segmentation, and see what others have done.
- Reproduce Results: Implement at least one existing method from the literature. Reproduction builds your understanding and confidence.
- Compare Methods and Metrics: Once you have a baseline method, compare it with other techniques. Keep track of performance metrics like Dice coefficient, accuracy, or IoU.
- Incorporate Novelty: Innovate by tweaking the loss functions, adding a new regularization term, or modifying the model architecture. Log every experiment, note down what worked and what did not, and analyze why.
By following these steps, you gradually evolve from just implementing existing methods to contributing new ideas that can lead to publications.
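For the comparison step, the Dice coefficient and IoU can both be computed from the overlap between a predicted mask and the ground-truth mask. Here is a minimal sketch for binary masks; the function name and the tiny masks are assumptions for illustration, and returning 1.0 when both masks are empty is a convention chosen here, not a universal rule.

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two binary masks (lists of rows of 0/1)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1  # predicted foreground, truly foreground
            elif p and not t:
                fp += 1  # predicted foreground, actually background
            elif t and not p:
                fn += 1  # missed foreground pixel
    dice_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * tp / dice_denominator if dice_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return dice, iou


# Hypothetical prediction and ground truth on a 2x3 image.
pred = [
    [1, 1, 0],
    [0, 1, 0],
]
target = [
    [1, 0, 0],
    [0, 1, 1],
]

dice, iou = dice_and_iou(pred, target)
print(round(dice, 4), round(iou, 4))  # 0.6667 0.5
```

The two metrics are related by Dice = 2 * IoU / (1 + IoU), so they always rank a pair of predictions on the same image in the same order; Dice is simply the more forgiving number. Accuracy, by contrast, counts background pixels too, which can make a model look good on images where the structure of interest is small.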
Reading Research Papers Efficiently
Mr. Dhruv admitted that reading research papers can be challenging. At the beginning, it’s fine to read them in detail—introduction, methodology, experiments, results, and conclusion. Once you know the field’s common language and standard approaches, you can read more strategically:
- Start with the abstract and conclusion to check if the paper is relevant.
- Look at results and discussions to understand the paper’s main contribution.
- If it seems useful, then dive deeper into the methodology.
- Skip the introduction once you are familiar with the basic concepts. After a few papers, introductions tend to repeat the same background details.
This approach saves time and helps you focus on the novelty of each paper rather than re-reading standard background material.
Planning, Journaling, and Time Management
Mr. Dhruv strongly recommends maintaining a research journal. At the start of a project, define your objective and break it down into smaller tasks. Assign timelines: for example, spend one week on data preprocessing, another week on training a baseline model, etc. This gives structure and reduces stress. If something takes more time than expected, adjust the plan. If something finishes early, you have spare time to explore more.
The key is to keep track of what you do each week, note down findings from experiments, and write summaries after reading a research paper. This record-keeping helps when you want to publish papers or write your thesis since you won’t have to rely only on memory.
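A research journal can be a paper notebook, but the experiment part of it is easy to keep in a small file as well. The sketch below appends one JSON record per experiment and reads them back to find the best run. It is only one possible way to do this; the function names, the field names, and the numbers in the demo are placeholders invented for the example, not real results.

```python
import json
import os
import tempfile
from datetime import date


def log_experiment(path, name, config, metrics, notes=""):
    """Append one experiment record to a JSON Lines file and return it."""
    entry = {
        "date": date.today().isoformat(),
        "name": name,
        "config": config,
        "metrics": metrics,
        "notes": notes,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def load_journal(path):
    """Read every record back as a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_run(entries, metric):
    """Return the record with the highest value of the given metric."""
    return max(entries, key=lambda e: e["metrics"][metric])


# Demo in a temporary folder; all values below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    journal_path = os.path.join(tmp, "journal.jsonl")
    log_experiment(journal_path, "baseline", {"lr": 0.001}, {"dice": 0.70},
                   notes="first supervised run")
    log_experiment(journal_path, "more-augmentation", {"lr": 0.001}, {"dice": 0.74},
                   notes="added flips and rotations")
    entries = load_journal(journal_path)
    best = best_run(entries, "dice")

print(best["name"])  # more-augmentation
```

Because every record carries the configuration and a free-text note, the file doubles as raw material for the experiments section of a paper or thesis.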
Managing Stress and Seeking Help
A Ph.D. or long-term research can be stressful. Experiments might fail, or progress may feel slow. To handle this, Mr. Dhruv suggested a few practical steps:
- Engage in sports or hobbies (like playing football) to clear your mind and maintain a positive mindset.
- Communicate with your research advisor or guide. Whenever he felt stuck, he would talk to his guide. Explaining your problem to someone else often brings clarity.
- Talk to peers and lab mates. Understanding that everyone faces difficulties in research helps you realize it’s normal and not a sign of personal failure.
- Focus on what you learned rather than only on outcomes: even if a particular experiment fails, you still learned something. Research is about learning from every attempt, successful or not.
Experience with Industry and Practical Applications
Mr. Dhruv also has insights into how research translates into real-world scenarios. He mentioned internships and discussions with seniors working in the industry. Through these experiences, he learned how to deploy models and understand the practical constraints—such as computational limitations, reliability, and user-friendliness.
He also noted the possibility of starting a company or a product line based on the skills and technologies developed during research. For example, one of his seniors is working on a wearable device that takes ECG signals and analyzes them automatically. Casual discussions and idea-sharing made him realize how the methodologies learned during his Ph.D. can directly contribute to building products and solutions that benefit society.
Emotional Well-Being and Long-Term Growth
According to Mr. Dhruv, the journey of doing a Ph.D. teaches you not only about technical subjects but also about problem-solving in life. Research trains your mind to think analytically, handle uncertainties, and be comfortable with not always having a direct answer. These traits become invaluable beyond the lab—whether you go into industry, academia, or entrepreneurship. The ability to break down complex problems, understand data, and systematically test solutions is universally useful.
He repeated that following a plan, keeping a journal, and adjusting timelines help reduce anxiety. Reading research papers strategically and communicating openly with advisors and peers keep you engaged and less isolated. Playing sports and maintaining hobbies contribute to your mental health. Together, these steps help you enjoy the research process and grow both intellectually and personally.
Conclusion
The points shared by Mr. Dhruv cover a wide range of topics: choosing the right subjects (deep learning, image processing, DSP basics, probability and statistics), using tools like Python and its libraries, working step-by-step from literature review to implementing methods and introducing novelty, reading research papers efficiently, planning projects with proper timelines, maintaining a research journal, managing stress through hobbies and communication, and finally connecting research with real-world applications.
He showed that each step of the journey, from deciding on a dataset to publishing research papers, involves careful thought and consistent effort. By following these guidelines, students can navigate the complex world of deep learning, computer vision, and medical image analysis more confidently and find their own path to success.