My life as an educator at NUS
Ganesh
Master's & PhD from National University of Singapore (NUS)
Several years in Industry/Academia
Architect, Manager, Technology Evangelist, DevOps Lead
Talks/workshops in America, Europe, Australia, Asia
Software Engineering, Cloud Computing, Artificial Intelligence, DevOps, Digital Humanities
Holds many industry certifications
Kathakali Dancer, Travel Vlogger, Speaker
GANESHNIYER
http://ganeshniyer.github.io
STeAdS: Software Engineering and Technological Advancements for Society
https://ganeshniyer.com/
Software Engineering
Agile
DevOps
Project Management
Cloud Computing
AI (Edge AI, ML/DL/GAN)
Web/Mobile

Technological Advancements
Art & Culture
Healthcare
Education
Society
Analysis of Student-LLM Interaction in a
Software Engineering Project
Agrawal Naman, Ridwan Shariffdeen, Guanlin Wang, Sanka Rasnayaka, Ganesh Neelakanta Iyer
School of Computing, National University of Singapore
International Conference on Software Engineering (ICSE 2025),
LLM4Code Workshop, Canada, April 2025
Background of Study
13-week-long SE project
126 students in teams of 6
LLM usage was encouraged
Premium accounts provided
Figure: Architecture of the Static Program Analyzer (SPA)
Research Questions
How do ChatGPT (conversational) and GitHub Copilot (autocomplete) compare?
How does the LLM-generated code evolve across the three project milestones?
Does student-LLM interaction lead to positive outcomes?
Data
126 UG students across 21 teams
Project with three major deadlines
730 code snippets
62 ChatGPT conversations
582,117 total lines of code
40,482 lines (~7%) generated with the help of LLMs
Methodology
Capturing LLM-generated code at each of the three milestones:
Custom tagging in the source code
Retrieving generated code snippets from the ChatGPT platform
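As a concrete illustration of the tagging step, here is a minimal sketch of harvesting tagged snippets from a repository. The `@llm-begin`/`@llm-end` comment markers and the `team-05-repo` path are hypothetical; the exact tag format used in the study is not shown on the slides.

```python
import re
from pathlib import Path

# Hypothetical tag format: paired comments around each LLM-generated block.
BEGIN = re.compile(r"//\s*@llm-begin\b(.*)")
END = re.compile(r"//\s*@llm-end\b")

def extract_tagged_snippets(repo_root: str, glob: str = "**/*.cpp"):
    """Yield (file, metadata, snippet_lines) for every tagged block."""
    for path in Path(repo_root).glob(glob):
        lines = path.read_text(errors="ignore").splitlines()
        inside, meta, buf = False, "", []
        for line in lines:
            if not inside:
                m = BEGIN.search(line)
                if m:
                    inside, meta, buf = True, m.group(1).strip(), []
            elif END.search(line):
                inside = False
                yield path, meta, buf
            else:
                buf.append(line)

# Example: total tagged LOC across one (hypothetical) team repository.
total = sum(len(snippet) for _, _, snippet in extract_tagged_snippets("team-05-repo"))
print(f"LLM-tagged lines: {total}")
```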
Findings
Usage of LLMs across the project
730 Total Snippets
535 Snippets in the first milestone
507 Copilot Snippets in total
How complex is the LLM-generated code?

01 Total Lines of Code (LOC)
Code verbosity.

02 Cyclomatic Complexity
Measures the number of linearly independent paths within the code.

03 Maximum Control Flow Graph (CFG) Depth
Measures the depth of nested structures within the code.

04 Halstead Effort
Mental effort required to understand and modify the code.
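To make the four metrics concrete, here is a minimal sketch that scores a snippet on all four, assuming the Python radon library for LOC, cyclomatic complexity, and Halstead effort, and approximating maximum CFG depth as the deepest nesting of control statements in the AST. The study's snippets were C++ and its exact tooling is not stated; Python is used here only for brevity.

```python
import ast
from radon.raw import analyze            # raw metrics, including LOC
from radon.complexity import cc_visit    # cyclomatic complexity per block
from radon.metrics import h_visit        # Halstead metrics

def max_nesting_depth(source: str) -> int:
    """Approximate max CFG depth as the deepest nesting of control structures."""
    control = (ast.If, ast.For, ast.While, ast.Try, ast.With)
    def depth(node, d=0):
        d += isinstance(node, control)
        return max([d] + [depth(c, d) for c in ast.iter_child_nodes(node)])
    return depth(ast.parse(source))

def snippet_metrics(source: str) -> dict:
    return {
        "loc": analyze(source).loc,
        "cyclomatic": sum(b.complexity for b in cc_visit(source)),
        "max_cfg_depth": max_nesting_depth(source),
        "halstead_effort": h_visit(source).total.effort,
    }

print(snippet_metrics(
    "def f(x):\n"
    "    if x > 0:\n"
    "        for i in range(x):\n"
    "            print(i)\n"
    "    return x\n"
))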
A.1 Complexity Analysis Result 1: Copilot vs ChatGPT
Copilot-generated code tends to be more complex, sometimes significantly exceeding typical student-generated complexity levels.
Figure 1: Density plots of the measured key metrics
A.1 Complexity Analysis Result 1: Copilot vs ChatGPT
ChatGPT produces more concise and readable code than Copilot, requiring lower cognitive effort for comprehension.
Figure 2: Comparison of ChatGPT and Copilot Complexity Across Various Complexity Measures
A.2 Complexity Analysis Result 2: Iterative Refinement
Example: ChatGPT’s conversational interface enables students to iteratively refine code, making it more concise and modular.
How was the generated code used in the Project?

01 Extent of Code Modification
How much do students alter AI-generated code?
Do they add new functionality, simplify for readability, or restructure it to fit project requirements?

02 Code Similarity Trends over Milestones
Does reliance on ChatGPT change over time?
Do students modify AI-generated code less as they progress?

03 In-depth Analysis of Conversations
How do students interact with ChatGPT across multiple prompts?
Do they refine code gradually, or is the most useful version generated early in the conversation?
B.1 Extent of Code Modification
Students frequently increase the complexity of ChatGPT-generated code, refining and expanding it for project needs.
Figure 4: Distribution of differences in complexity measures between repo and GPT code (log-transformed x-axis)
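For illustration, here is one way such a distribution could be produced. The arrays below are hypothetical per-snippet complexity values, and the log1p transform is an assumption; the slide only states that the x-axis is log-transformed.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-snippet complexity (e.g. cyclomatic complexity of the
# ChatGPT output vs. the version integrated into the team repository).
gpt = np.array([2, 3, 1, 4, 2, 5, 3])
repo = np.array([4, 3, 3, 7, 2, 9, 5])

# Positive deltas mean students made the code more complex after generation.
delta = np.clip(repo - gpt, 0, None)
plt.hist(np.log1p(delta), bins=10, density=True)
plt.xlabel("log(1 + complexity increase)")
plt.ylabel("probability density")
plt.show()
```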
B.2 Code Similarity Trends over Milestones: Metrics

01 Jaccard Similarity
Measures the overlap between two sets of structural elements extracted using Tree-sitter.

02 Longest Common Subsequence (LCS)
Captures the longest matching sequence of tokens between code snippets; pairs above 90% similarity are considered equivalent.
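A minimal sketch of the two metrics follows, with a plain regex tokenizer standing in for the Tree-sitter structural extraction used in the paper; `lcs_similarity` normalizes the LCS length by the longer token sequence, which is one plausible normalization among several.

```python
import re

def tokens(code: str) -> list[str]:
    """Crude tokenizer: identifiers, or single punctuation characters.
    The paper extracts structural elements with Tree-sitter; this only
    approximates that."""
    return re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]", code)

def jaccard(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def lcs_similarity(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    # Classic O(n*m) dynamic program for longest-common-subsequence length.
    dp = [[0] * (len(tb) + 1) for _ in range(len(ta) + 1)]
    for i, x in enumerate(ta):
        for j, y in enumerate(tb):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1] / max(len(ta), len(tb), 1)

a = "int add(int x, int y) { return x + y; }"
b = "int add(int a, int b) { return a + b; }"
print(jaccard(a, b), lcs_similarity(a, b))
# Pairs scoring above 0.9 on LCS similarity would be treated as equivalent.
```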
B.2 Code Similarity Trends over Milestones: Result
Students’ AI usage evolved from exploration to seamless integration, with increasing similarity scores across milestones.
Figure 5: Similarity of generated and integrated code across milestones
Prompts by teams 5 and 13, during MS1
Prompts by teams 5 and 13, during MS2 & MS3
B.3 In-depth Analysis of Conversations
Students actively refine ChatGPT-generated code through multiple exchanges before integrating it into their projects.
Figure 6: Mean Index of the Generated Code Most Similar to the Repository Code for Conversations of Different Lengths
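To make the Figure 6 quantity concrete, here is a sketch of how the index of the most similar generated snippet can be computed per conversation. difflib's ratio is used as a stand-in similarity measure, and the conversation and repo code below are hypothetical.

```python
import difflib

def most_similar_index(generated: list[str], repo_code: str) -> int:
    """1-based index of the generated snippet closest to the integrated code.

    Figure 6 plots the mean of this index for conversations of different
    lengths; a late index suggests gradual refinement rather than an early hit.
    """
    scores = [difflib.SequenceMatcher(None, g, repo_code).ratio() for g in generated]
    return scores.index(max(scores)) + 1

# Hypothetical three-turn conversation: successive refinements of one snippet.
conversation = [
    "def parse(q): pass",
    "def parse(q): return q.split()",
    "def parse(query): return [t.lower() for t in query.split()]",
]
final = "def parse(query): return [t.lower() for t in query.split()]"
print(most_similar_index(conversation, final))  # -> 3: the last turn is closest
```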
